1. Introduction



Strongly disordered systems such as spin glasses represent some of the most interesting and 
most difficult problems of statistical mechanics. Amongst the most remarkable achievements of 
theoretical physics in this field is the exact solution of some models of mean field type via the replica 
trick and Parisi's replica symmetry breaking scheme (for an exposition see [MPV]; the application
to the Hopfield model [Ho] was carried out in [AGS]). The replica trick is a formal tool that allows
one to eliminate the difficulty of studying disordered systems by integrating out the randomness at the
expense of having to perform an analytic continuation of some function computable only on the
positive integers to the value zero^1. Mathematically, this procedure is highly mysterious and has
so far resisted all attempts to be put on a solid basis. On the other hand, its apparent success 
is a clear sign that something ought to be understood better in this method. An apparently less 
mysterious approach that yields the same answer is the cavity method [MPV]. However, here too, 
the derivation of the solutions involves a large number of intricate and unproven assumptions that 
seem hard or impossible to justify in general. 

However, there has been some distinct progress in understanding the approach of the cavity 
method at least in simple cases where no breaking of the replica symmetry occurs. The first at- 
tempts in this direction were made by Pastur and Shcherbina [PS] in the Sherrington-Kirkpatrick 
model and Pastur, Shcherbina and Tirozzi [PST] in the Hopfield model. Their results were conditional:
they show that the replica symmetric solution holds under a certain unverified
assumption, namely the vanishing of the so-called Edwards-Anderson parameter. A breakthrough
was achieved in a recent paper by Talagrand [Tl] where he proved the validity of the replica sym- 
metric solution in an explicit domain of the model parameters in the Hopfield model. His approach 
is purely by induction over the volume (i.e. the cavity method) and uses only some a priori es- 
timates on the support properties of the distribution of the so-called overlap parameters as first 
proven in [BGP1,BGP2] and in sharper form in [BG1]. 

Let us recall the definition of the Hopfield model and some basic notations. Let $S_N \equiv \{-1,1\}^N$
denote the set of functions $\sigma : \{1,\dots,N\} \to \{-1,1\}$, and set $S \equiv \{-1,1\}^{\mathbb N}$. We call $\sigma$ a spin
configuration and denote by $\sigma_i$ the value of $\sigma$ at $i$. Let $(\Omega, \mathcal F, \mathbb P)$ be an abstract probability space
and let $\xi_i^\mu$, $i, \mu \in \mathbb N$, denote a family of independent identically distributed random variables on
this space. For the purposes of this paper we will assume that $\mathbb P[\xi_i^\mu = \pm 1] = \frac12$. We will write
$\xi^\mu[\omega]$ for the $N$-dimensional random vector whose $i$-th component is given by $\xi_i^\mu[\omega]$ and call such

1 As a matter of fact, such an analytic continuation is not performed. What is done is much more subtle: The 
function at integer values is represented as some integral suitable for evaluation by a saddle point method. Instead of 
doing this, apparently irrelevant critical points are selected judiciously and the ensuing wrong value of the function 
is then continued to the correct value at zero. 
a vector a 'pattern'. On the other hand, we use the notation $\xi_i[\omega]$ for the $M$-dimensional vector
with the same components. When we write $\xi[\omega]$ without indices, we frequently will consider it as
an $M \times N$ matrix and we write $\xi^t[\omega]$ for the transpose of this matrix. Thus, $\xi^t[\omega]\xi[\omega]$ is the $M \times M$
matrix whose elements are $\sum_{i=1}^N \xi_i^\mu[\omega]\xi_i^\nu[\omega]$. With this in mind we will use throughout the paper a
vector notation with $(\cdot,\cdot)$ standing for the scalar product in whatever space the argument may lie.
E.g. the expression $(y, \xi_i)$ stands for $\sum_{\mu=1}^M \xi_i^\mu y_\mu$, etc.

We define random maps $m_N^\mu[\omega] : S_N \to [-1,1]$ through^2

$$m_N^\mu[\omega](\sigma) \equiv \frac{1}{N} \sum_{i=1}^N \xi_i^\mu[\omega]\sigma_i \tag{1.1}$$

Naturally, these maps 'compare' the configuration $\sigma$ globally to the random configuration $\xi^\mu[\omega]$. A
Hamiltonian is now defined as the simplest negative function of these variables, namely

$$H_N[\omega](\sigma) \equiv -\frac{N}{2} \sum_{\mu=1}^{M(N)} \left(m_N^\mu[\omega](\sigma)\right)^2 = -\frac{N}{2} \|m_N[\omega](\sigma)\|_2^2 \tag{1.2}$$

where $M(N)$ is some, generally increasing, function that crucially influences the properties of the
model. $\|\cdot\|_2$ denotes the $\ell_2$-norm in $\mathbb R^M$, and the vector $m_N[\omega](\sigma)$ is always understood to be
$M(N)$-dimensional.

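Through this Hamiltonian we define in a natural way finite volume Gibbs measures on $S_N$ via

$$\mu_{N,\beta}[\omega](\sigma) \equiv \frac{2^{-N} e^{-\beta H_N[\omega](\sigma)}}{Z_{N,\beta}[\omega]} \tag{1.3}$$

and the induced distribution of the overlap parameters

$$Q_{N,\beta}[\omega] \equiv \mu_{N,\beta}[\omega] \circ m_N[\omega]^{-1} \tag{1.4}$$

The normalizing factor $Z_{N,\beta}[\omega]$, given by

$$Z_{N,\beta}[\omega] \equiv 2^{-N} \sum_{\sigma \in S_N} e^{-\beta H_N[\omega](\sigma)} \equiv \mathbb E_\sigma e^{-\beta H_N[\omega](\sigma)} \tag{1.5}$$

is called the partition function. We are interested in the large $N$ behaviour of these measures.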
In our previous work we have been mostly concerned with the limiting induced measures. In this 
paper we return to the limiting behaviour of the Gibbs measures themselves, making use, however, 
of the information obtained on the asymptotic properties of the induced measures. 
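
For very small $N$ the objects just defined can be computed by brute-force enumeration, which may help to fix ideas. The following is only an illustrative sketch and not part of our analysis: the function names (`overlaps`, `hamiltonian`, `gibbs_measure`, `random_patterns`) are hypothetical, and we assume the overlaps are $m_N^\mu(\sigma) = \frac1N \sum_i \xi_i^\mu \sigma_i$ and the Hamiltonian is $-\frac N2 \|m_N(\sigma)\|_2^2$.

```python
import itertools
import math
import random


def overlaps(sigma, xi):
    """Overlap vector m_N(sigma): m^mu = (1/N) * sum_i xi[mu][i] * sigma[i]."""
    n = len(sigma)
    return [sum(x * s for x, s in zip(pattern, sigma)) / n for pattern in xi]


def hamiltonian(sigma, xi):
    """H_N(sigma) = -(N/2) * ||m_N(sigma)||_2^2 (assumed form of the Hamiltonian)."""
    n = len(sigma)
    return -0.5 * n * sum(m * m for m in overlaps(sigma, xi))


def gibbs_measure(xi, beta):
    """Finite volume Gibbs measure on {-1,1}^N, by enumerating all 2^N configurations."""
    n = len(xi[0])
    configs = list(itertools.product((-1, 1), repeat=n))
    weights = [math.exp(-beta * hamiltonian(s, xi)) for s in configs]
    total = sum(weights)  # proportional to the partition function; constants cancel
    return {s: w / total for s, w in zip(configs, weights)}


def random_patterns(m, n, seed=0):
    """M patterns of length N with i.i.d. symmetric +-1 entries (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
```

For instance, `gibbs_measure(random_patterns(2, 8), beta=1.5)` returns a dictionary of probabilities over the $2^8$ configurations. The enumeration is of course only feasible for very small $N$ and says nothing about the large $N$ behaviour studied here.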



2 We will make the dependence of random quantities on the random parameter $\omega$ explicit by an added $[\omega]$
whenever we want to stress it. Otherwise, we will frequently drop the reference to $\omega$ to simplify the notation.






We pursue two objectives. Firstly, we give an alternative proof of Talagrand's result (with 
possibly a slightly different range of parameters) that, although equally based on the cavity method, 
makes more extensive use of the properties of the overlap-distribution that were proven in [BG1]. 
This allows, in our opinion, some considerable simplifications. Secondly, we will elucidate some 
conceptual issues concerning the infinite volume Gibbs states in this model. Several delicacies in 
the question of convergence of finite volume Gibbs states (or local specifications) in highly disordered 
systems, and in particular spin glasses, were pointed out repeatedly by Newman and Stein over 
the last years [NS1,NS2]. But only during the last year did they propose the formalism of so-called 
"metastates" [NS3,NS4,N] that seems to provide the appropriate framework to discuss these issues. 
In particular, we will show that in the Hopfield model, this formalism seems unavoidable for spelling 
out convergence results. 

Let us formulate our main result in a slightly preliminary form (precise formulations require 
some more discussion and notation and will be given in Section 5). 

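Denote by $m^*(\beta)$ the largest solution of the mean field equation $m = \tanh(\beta m)$ and by $e^\mu$ the
$\mu$-th unit vector of the canonical basis of $\mathbb R^M$. For all $(\mu, s) \in \{1,\dots,M\} \times \{-1,1\}$ let $B_\rho^{(\mu,s)} \subset \mathbb R^M$
denote the ball of radius $\rho$ centered at $s m^* e^\mu$. For any pair of indices $(\mu, s)$ and any $\rho > 0$ we
define the conditional measures

$$\mu_{N,\beta,\rho}^{(\mu,s)}[\omega](\mathcal A) \equiv \mu_{N,\beta}[\omega]\left(\mathcal A \,\middle|\, B_\rho^{(\mu,s)}\right), \quad \mathcal A \in \mathcal B(\{-1,1\}^N) \tag{1.6}$$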

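The so-called "replica symmetric equations"^3 of [AGS] are the following system of equations in
three unknowns $m_1$, $r$, and $q$:

$$\begin{aligned}
m_1 &= \int d\mathcal N(g) \tanh\left(\beta(m_1 + \sqrt{\alpha r}\, g)\right) \\
q &= \int d\mathcal N(g) \tanh^2\left(\beta(m_1 + \sqrt{\alpha r}\, g)\right) \\
r &= \frac{q}{(1 - \beta + \beta q)^2}
\end{aligned} \tag{1.7}$$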
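
A rough numerical feeling for the system (1.7) can be obtained by naive fixed point iteration. The sketch below is illustrative only: it assumes the equations in the form $m_1 = \int d\mathcal N(g)\tanh(\beta(m_1+\sqrt{\alpha r}\,g))$, $q = \int d\mathcal N(g)\tanh^2(\beta(m_1+\sqrt{\alpha r}\,g))$, $r = q/(1-\beta+\beta q)^2$, it replaces the gaussian integrals by a trapezoidal rule on a truncated interval, and the names `gauss_expect`, `rs_step` and `solve_rs` are hypothetical. Convergence of the iteration is not guaranteed for all parameter values.

```python
import math


def gauss_expect(func, half_width=8.0, points=801):
    """Approximate E[func(g)], g standard normal, by the trapezoidal rule."""
    step = 2.0 * half_width / (points - 1)
    total = 0.0
    for k in range(points):
        g = -half_width + k * step
        weight = 0.5 if k in (0, points - 1) else 1.0
        total += weight * func(g) * math.exp(-0.5 * g * g)
    return total * step / math.sqrt(2.0 * math.pi)


def rs_step(beta, alpha, m1, q, r):
    """Apply the map defined by the right-hand sides of (1.7) once."""
    sd = math.sqrt(alpha * r)
    m1_new = gauss_expect(lambda g: math.tanh(beta * (m1 + sd * g)))
    q_new = gauss_expect(lambda g: math.tanh(beta * (m1 + sd * g)) ** 2)
    r_new = q_new / (1.0 - beta + beta * q_new) ** 2
    return m1_new, q_new, r_new


def solve_rs(beta, alpha, iterations=200):
    """Iterate rs_step from (1, 1, 1); a sketch, with no convergence guarantee."""
    m1, q, r = 1.0, 1.0, 1.0
    for _ in range(iterations):
        m1, q, r = rs_step(beta, alpha, m1, q, r)
    return m1, q, r
```

For $\alpha = 0$ the first equation reduces to the mean field equation $m = \tanh(\beta m)$, which gives a simple consistency check on the iteration.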

With this notation we can state 

Theorem 1.1: There exists a nonempty connected set of parameters $\beta, \alpha$ bounded by the curves
$\alpha = 0$, $\alpha = c(m^*(\beta))^4$ and $\beta = c_0\alpha$, such that if $\lim_{N\uparrow\infty} M(N)/N = \alpha$ the following holds: For
any finite $I \subset \mathbb N$, and for any $s_I \in \{-1,1\}^I$,



3 We cite these equations, (3.3-5) in [AGS], only for the case $k = 1$, where $k$ is the number of the so-called
"condensed patterns". One could presumably generalize our results to measures conditioned on balls around "mixed
states", i.e. the metastable states with more than one "condensed pattern", but we have not worked out the details.
as $N \uparrow \infty$, where the $g_i$, $i \in I$, are independent gaussian random variables with mean zero and
variance one that are independent of the random variables $\xi$. The convergence is understood
in law with respect to the distribution of the gaussian variables $g_i$.

This theorem should be juxtaposed to our second result: 

Theorem 1.2: On the same set of parameters as in Theorem 1.1, the following is true with
probability one: For any finite $I \subset \mathbb N$ and for any $x \in \mathbb R^I$, there exist subsequences $N_k[\omega] \uparrow \infty$
such that for any $s_I \in \{-1,1\}^I$, if $\alpha > 0$,

$$\lim_{k\uparrow\infty} \mu_{N_k[\omega],\beta,\rho}^{(1,1)}[\omega](\{\sigma_I = s_I\}) = \prod_{i \in I} \frac{e^{s_i x_i}}{2\cosh(x_i)} \tag{1.9}$$



The above statements may look a little bit surprising and need clarification. This will be the 
main purpose of Section 2, where we give a rather detailed discussion of the problem of convergence 
and the notion of metastates with the particular issues in disordered mean field models in view. We 
will also propose yet a different notion of a state (let us call it "superstate"), that tries to capture the 
asymptotic volume dependence of Gibbs states in the form of a continuous time measure valued 
stochastic process. We also discuss the issue of the "boundary conditions" or rather "external 
fields", and the construction of conditional Gibbs measures in this context. This will hopefully 
prepare the ground for the understanding of our results in the Hopfield case. 

The following two sections collect technical preliminaries. Section 3 recalls some results on the
overlap distribution from [BG1-3] that will be crucially needed later. Section 4 states and proves a 
version of the Brascamp-Lieb inequalities [BL] that is suitable for our situation. 

Section 5 contains our central results. Here we construct explicitly the finite dimensional 
marginals of the Gibbs measures in finite volume and study their behaviour in the infinite volume 
limit. The results will be stated in the language of metastates. In this section we assume the 
convergence of certain thermodynamic functions which will be proven in Section 6. Modulo this, 
this section contains the precise statements and proofs of Theorems 1.1 and 1.2. 

In Section 6 we give a proof of the convergence of these quantities and we relate them to the 
replica symmetric solution. This section is largely based on the ideas of [PST] and [Tl] and is
mainly added for the convenience of the reader. 

Acknowledgements: We gratefully acknowledge helpful discussions on metastates with Ch. Newman
and Ch. Külske.
2. Notions of convergence of random Gibbs measures.

In this section we make some remarks on the appropriate picture for the study of limiting 
Gibbs measures for disordered systems, with particular regard to the situation in mean-field like
systems. Although some of the observations we will make here arose naturally from the properties
we discovered in the Hopfield model, our understanding has been greatly enhanced by the recent
work of Newman and Stein [NS3,NS4,N] and their introduction of the concept of "metastates".
We refer the reader to their papers for more detail and further applications. Some nice examples
can also be found in [K,BGK]. Otherwise, we keep this section self-contained and geared for the
situation we will describe in the Hopfield model, although part of the discussion is very general
and not restricted to mean field situations. For this reason we talk about finite volume measures
indexed by finite sets $\Lambda$ rather than by the integer $N$.

Metastates. The basic objects of study are finite volume Gibbs measures, $\mu_{\Lambda,\beta}$ (which for
convenience we will always consider as measures on the infinite product space $S_\infty$). We denote by
$(\mathcal M_1(S_\infty), \mathcal G)$ the measurable space of probability measures on $S_\infty$ equipped with the sigma-algebra
$\mathcal G$ generated by the open sets with respect to the weak topology on $\mathcal M_1(S_\infty)$^4. We will always regard
Gibbs measures as random variables on the underlying probability space $(\Omega, \mathcal F, \mathbb P)$ with values in
the space $\mathcal M_1(S_\infty)$, i.e. as measurable maps $\Omega \to \mathcal M_1(S_\infty)$.

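We are in principle interested in considering weak limits of these measures as $\Lambda \uparrow \infty$. There
are essentially three things that may happen:

(1) Almost sure convergence: For $\mathbb P$-almost all $\omega$,

$$\mu_\Lambda[\omega] \to \mu_\infty[\omega] \tag{2.1}$$

where $\mu_\infty[\omega]$ may or may not depend on $\omega$ (in general it will).

(2) Convergence in law:

$$\mu_\Lambda \xrightarrow{\mathcal D} \mu_\infty \tag{2.2}$$

(3) Almost sure convergence along random subsequences: There exist (at least for almost all $\omega$)
subsequences $\Lambda_i[\omega] \uparrow \infty$ such that

$$\mu_{\Lambda_i[\omega]}[\omega] \to \mu_{\infty,\{\Lambda_i[\omega]\}}[\omega] \tag{2.3}$$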

In systems with compact single site state space, (3) holds always, and there are models with 
non-compact state space where it holds with the "almost sure" provision (see e.g. [BK]). However, 

4 Note that a basis of open sets is given by sets of the form $N_{f_1,\dots,f_k,\epsilon}(\mu) \equiv \{\mu' : \forall_{1 \le i \le k}\, |\mu(f_i) - \mu'(f_i)| < \epsilon\}$, where
$f_i$ are continuous functions on $S^\infty$; indeed, it is enough to consider cylinder functions.
this contains little information, if the subsequences along which convergence holds are only known
implicitly. In particular, it gives no information on how, for any given large $\Lambda$, the measure $\mu_\Lambda$
"looks like approximately". In contrast, if (1) holds, we are in a very nice situation, as for any large
enough $\Lambda$ and for (almost) any realization of the disorder, the measure $\mu_\Lambda[\omega]$ is well approximated
by $\mu_\infty[\omega]$. Thus, the situation would be essentially like in an ordered system (the "almost sure"
excepted). It seems to us that the common feeling of most people working in the field of disordered
systems was that this could be arranged by putting suitable boundary conditions or external fields,
to "extract pure states". Newman and Stein [NS1] were, to our knowledge, the first to point to
difficulties with this point of view. In fact, there is no reason why we should ever be, or be able to put
ourselves, in a situation where (1) holds, and this possibility should be considered as perfectly exceptional.
With (3) uninteresting and (1) unlikely, we are left with (2). By compactness, (2) holds always
at least for (non-random!) subsequences $\Lambda_n$, and even convergence without subsequences can be
expected rather commonly. On the other hand, (2) gives us very reasonable information on our
system, telling us what is the chance that our measure $\mu_\Lambda$ for large $\Lambda$ will look like some measure
$\mu_\infty$. This is much more than what (3) tells us, and barring the case where (1) holds, all we may
reasonably expect to know.

We should thus investigate the case (2) more closely. As proposed actually first by Aizenman
and Wehr [AW], it is most natural to consider an object $K_\Lambda$ defined as a measure on the product
space $\Omega \otimes \mathcal M_1(S_\infty)$ (equipped with the product topology and the weak topology, respectively),
such that its marginal distribution on $\Omega$ is $\mathbb P$ while the conditional measure, $\kappa_\Lambda(\cdot)[\omega]$, on $\mathcal M_1(S_\infty)$
given $\mathcal F$^5 is the Dirac measure on $\mu_\Lambda[\omega]$; the marginal on $\mathcal M_1(S_\infty)$ is then of course the law of
$\mu_\Lambda$. The advantage of this construction over simply regarding the law of $\mu_\Lambda$ lies in the fact that
we can in this way extract more information by conditioning, as we shall explain. Note that by
compactness $K_\Lambda$ converges at least along (non-random!) subsequences, and we may assume that it
actually converges to some measure $K$. Now the case (1) above corresponds to the situation where
the conditional probability on $\mathcal G$ given $\mathcal F$ is degenerate, i.e.

$$\kappa(\cdot)[\omega] = \delta_{\mu_\infty[\omega]}(\cdot), \quad \text{a.s.} \tag{2.4}$$

Thus we see that in general even the conditional distribution $\kappa(\cdot)[\omega]$ of $K$ is a nontrivial measure on
the space of infinite volume Gibbs measures, this latter object being called the (Aizenman-Wehr)
metastate^6. What happens is that the asymptotic properties of the Gibbs measures as the volume
tends to infinity depend in an intrinsic way on the tail sigma field of the disorder variables, and even

5 We write shorthand $\mathcal F$ for $\mathcal M_1(S_\infty) \otimes \mathcal F$ whenever appropriate.

6 It may be interesting to recall the reasons that led Aizenman and Wehr to this construction. In their analysis
of the effect of quenched disorder on phase transitions they required the existence of "translation-covariant" states.
Such objects could be constructed as weak limits of finite volume states with e.g. periodic or translation invariant
after all random variables are fixed, some "new" randomness appears that allows only probabilistic
statements on the asymptotic Gibbs state. 

A toy example: It may be useful to illustrate the passage from convergence in law to the Aizenman-Wehr
metastate in a more familiar context, namely the ordinary central limit theorem. Let
$(\Omega, \mathcal F, \mathbb P)$ be a probability space, and let $\{X_i\}_{i \in \mathbb N}$ be a family of i.i.d. centered random variables
with variance one; let $\mathcal F_n$ be the sigma algebra generated by $X_1, \dots, X_n$ and let $\mathcal F \equiv \lim_{n\uparrow\infty} \mathcal F_n$.
Define the real valued random variable $G_n \equiv \frac{1}{\sqrt n} \sum_{i=1}^n X_i$. We may define the joint law $K_n$ of $G_n$
and the $X_i$ as a probability measure on $\mathbb R \otimes \Omega$. Clearly, this measure converges to some measure $K$
whose marginal on $\mathbb R$ will be the standard normal distribution. However, we can say more, namely

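Toy-Lemma 2.1 In the example described above, the conditional measure $\kappa(\cdot)[\omega] \equiv K(\cdot|\mathcal F)[\omega]$ satisfies

$$\kappa(\cdot)[\omega] = \mathcal N(0,1), \quad \mathbb P\text{-a.s.} \tag{2.5}$$

Proof: We need to understand what (2.5) means. Let $f$ be a continuous function on $\mathbb R$. We claim
that for almost all $\omega$,

$$\int f(x)\, \kappa(dx)[\omega] = \int \frac{e^{-x^2/2}}{\sqrt{2\pi}} f(x)\, dx \tag{2.6}$$

Define the martingale $h_n \equiv \int f(x)\, K(dx, d\omega | \mathcal F_n)$. We may write

$$\begin{aligned}
h_n &= \lim_{N\uparrow\infty} \mathbb E_{X_{n+1}} \cdots \mathbb E_{X_N} f\left(\frac{1}{\sqrt N} \sum_{i=1}^N X_i\right) \\
&= \lim_{N\uparrow\infty} \mathbb E_{X_{n+1}} \cdots \mathbb E_{X_N} f\left(\frac{1}{\sqrt{N-n}} \sum_{i=n+1}^N X_i\right), \quad \text{a.s.} \\
&= \int \frac{e^{-x^2/2}}{\sqrt{2\pi}} f(x)\, dx
\end{aligned} \tag{2.7}$$

where we used that for fixed $n$, $\frac{1}{\sqrt N} \sum_{i=1}^n X_i$ converges to zero as $N \uparrow \infty$ almost surely. Thus,
for any continuous $f$, $h_n$ is almost surely constant, while $\lim_{n\uparrow\infty} h_n = \int f(x)\, K(dx, d\omega | \mathcal F)$, by the
martingale convergence theorem. This proves the lemma.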

The CLT example may inspire the question whether one might not be able to retain more 
information on the convergence of the random Gibbs state than is kept in the Aizenman-Wehr 
metastate. The metastate tells us about the probability distribution of the limiting measure, but 



boundary conditions, provided the corresponding sequences converge almost surely (and not via subsequences with 
possibly different limits). They noted that in a general disordered system this may not be true. The metastate 
provided a way out of this difficulty. 
we have thrown out all information on how, for a given $\omega$, the finite volume measures behave as the
volume increases.



Newman and Stein [NS3,NS4] have introduced a possibly more profound concept of the empirical
metastate which captures more precisely the asymptotic volume dependence of the Gibbs
states in the infinite volume limit. We will briefly discuss this object and elucidate its meaning in
the above CLT context. Let $\Lambda_n$ be an increasing and absorbing sequence of finite volumes. Define
the random empirical measures $\kappa_N^{em}(\cdot)[\omega]$ on $\mathcal M_1(S^\infty)$ by

$$\kappa_N^{em}(\cdot)[\omega] \equiv \frac{1}{N} \sum_{n=1}^N \delta_{\mu_{\Lambda_n}[\omega]} \tag{2.8}$$

In [NS4] it was proven that for sufficiently sparse sequences $\Lambda_n$ and subsequences $N_i$, it is true that
almost surely

$$\lim_{i\uparrow\infty} \kappa_{N_i}^{em}(\cdot)[\omega] = \kappa(\cdot)[\omega] \tag{2.9}$$

Newman and Stein conjectured that in many situations, the use of sparse subsequences would not be
necessary to achieve the above convergence. However, Külske [K] has exhibited some simple mean
field examples where almost sure convergence only holds for very sparse (exponentially spaced)
subsequences. He also showed that for more slowly growing sequences convergence in law can be
proven in these cases.

Toy example revisited: All this is easily understood in our example. We set $G_n \equiv \frac{1}{\sqrt n} \sum_{i=1}^n X_i$.
Then the empirical metastate corresponds to

$$\kappa_N^{em}(\cdot)[\omega] \equiv \frac{1}{N} \sum_{n=1}^N \delta_{G_n[\omega]} \tag{2.10}$$

We will prove that the following Lemma holds:

Toy-Lemma 2.2 Let $G_n$ and $\kappa_N^{em}(\cdot)[\omega]$ be defined above. Let $B_t$, $t \in [0,1]$, denote a standard
Brownian motion. Then

(i) The random measures $\kappa_N^{em}$ converge in law to the measure $\kappa^{em} = \int_0^1 dt\, \delta_{t^{-1/2} B_t}$

(ii)

$$\mathbb E[\kappa^{em}(\cdot) | \mathcal F] = \mathcal N(0,1) \tag{2.11}$$



Proof: Our main objective is to prove (i). We will see that quite clearly, this result relates
to Lemma 2.1 as the CLT to the Invariance Principle, and indeed, its proof is essentially an
immediate consequence of Donsker's Theorem. Donsker's theorem (see [HH] for a formulation in
more generality than needed in this chapter) asserts the following: Let $\eta_n(t)$ denote the continuous
function on $[0,1]$ that for $t = k/n$ is given by

$$\eta_n(k/n) \equiv \frac{1}{\sqrt n} \sum_{i=1}^k X_i \tag{2.12}$$

and that interpolates linearly between these values for all other points $t$. Then, $\eta_n(t)$ converges
in distribution to standard Brownian motion in the sense that for any continuous functional
$F : C([0,1]) \to \mathbb R$ it is true that $F(\eta_n)$ converges in law to $F(B)$. From here the proof of (i) is obvious.
We have to prove that for any bounded continuous function $f$,

$$\frac{1}{N} \sum_{n=1}^N \delta_{G_n}(f) \equiv \frac{1}{N} \sum_{n=1}^N f\left(\eta_N(n/N)/\sqrt{n/N}\right) \xrightarrow{\mathcal D} \int_0^1 dt\, f(B_t/\sqrt t) \equiv \int_0^1 dt\, \delta_{B_t/\sqrt t}(f) \tag{2.13}$$

To see this, simply define the continuous functionals $F$ and $F_N$ by

$$F(\eta) \equiv \int_0^1 dt\, f(\eta(t)/\sqrt t) \tag{2.14}$$

and

$$F_N(\eta) \equiv \frac{1}{N} \sum_{n=1}^N f\left(\eta(n/N)/\sqrt{n/N}\right) \tag{2.15}$$

We have to show that in distribution $F(B) - F_N(\eta_N)$ converges to zero. But

$$F(B) - F_N(\eta_N) = F(B) - F(\eta_N) + F(\eta_N) - F_N(\eta_N) \tag{2.16}$$

By the invariance principle, $F(B) - F(\eta_N)$ converges to zero in distribution while $F(\eta_N) - F_N(\eta_N)$
converges to zero since $F_N$ is the Riemann sum approximation to $F$.

To see that (ii) holds, note first that as in the CLT, the Brownian motion $B_t$ is measurable
with respect to the tail sigma-algebra of the $X_i$. Thus

$$\mathbb E[\kappa^{em} | \mathcal F] = \mathcal N(0,1) \tag{2.17}$$

❖ 

Remark: It is easily seen that for sufficiently sparse subsequences $n_i$ (e.g. $n_i = \dots$),

$$\frac{1}{N} \sum_{i=1}^N \delta_{G_{n_i}} \to \mathcal N(0,1), \quad \text{a.s.} \tag{2.18}$$

but the weak convergence result contains in a way more information.
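
The difference between the empirical metastate of a single realization and its average over realizations is easy to observe numerically in the toy example. The following sketch is purely illustrative: it assumes $X_i = \pm 1$ with probability $\frac12$ (one particular choice of centered variables with variance one), and the helper names `normalized_sums`, `empirical_metastate_mass` and `standard_normal_mass` are hypothetical.

```python
import math
import random


def normalized_sums(n_max, rng):
    """Return [G_1, ..., G_{n_max}] with G_n = n^{-1/2} * (X_1 + ... + X_n)."""
    partial = 0
    values = []
    for n in range(1, n_max + 1):
        partial += rng.choice((-1, 1))  # assumption: symmetric +-1 variables
        values.append(partial / math.sqrt(n))
    return values


def empirical_metastate_mass(values, a, b):
    """Mass that (1/N) * sum_n delta_{G_n} assigns to the interval [a, b]."""
    return sum(1 for g in values if a <= g <= b) / len(values)


def standard_normal_mass(a, b):
    """N(0,1)-mass of [a, b], computed from the error function."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return cdf(b) - cdf(a)
```

Computing `empirical_metastate_mass(normalized_sums(400, random.Random(seed)), -1.0, 1.0)` for different values of `seed` gives values that vary strongly from one realization to the next, as one expects if the limit is a random measure; only their average over many realizations comes close to `standard_normal_mass(-1.0, 1.0)`.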
Superstates: In our example we have seen that the empirical metastate converges in distribution
to the empirical measure of the stochastic process $B_t/\sqrt t$. It appears natural to think that the
construction of the corresponding continuous time stochastic process itself is actually the right way
to look at the problem also in the context of random Gibbs measures, and that the empirical
metastate could converge (in law) to the empirical measure of this process. To do this we propose
the following, yet somewhat tentative construction.

We fix again a sequence of finite volumes $\Lambda_n$^7. We define for $t \in [0,1]$

$$\mu_{\Lambda_n}^t[\omega] \equiv (t - [tn]/n)\, \mu_{\Lambda_{[tn]+1}}[\omega] + (1 - t + [tn]/n)\, \mu_{\Lambda_{[tn]}}[\omega] \tag{2.19}$$

(where as usual $[x]$ denotes the largest integer less than or equal to $x$). Clearly this object is
a continuous time stochastic process whose state space is $\mathcal M_1(S)$. We may try to construct the
limiting process

$$\mu^t[\omega] \equiv \lim_{n\uparrow\infty} \mu_{\Lambda_n}^t[\omega] \tag{2.20}$$

n\oo 

where the limit again can in general be expected only in distribution. Obviously, in our CLT
example, this is precisely how we construct the Brownian motion in the invariance principle. We can
now of course repeat the construction of the Aizenman-Wehr metastate on the level of processes.
To do this, one must make some choices for the topological space one wants to work in. A natural
possibility is to consider the space $C([0,1], \mathcal M_1(S^\infty))$ of continuous measure valued functions
equipped with the uniform weak topology^8, i.e. we say that a sequence of its elements $\lambda_i$ converges
to $\lambda$, if and only if, for all continuous functions $f : S^\infty \to \mathbb R$,

$$\lim_{i\uparrow\infty} \sup_{t \in [0,1]} |\lambda_{i,t}(f) - \lambda_t(f)| = 0 \tag{2.21}$$

Since the weak topology is metrizable, so is the uniform weak topology and $C([0,1], \mathcal M_1(S^\infty))$
becomes a metric space so we may define the corresponding sigma-algebra generated by the open sets.
Taking the tensor product with our old $\Omega$, we can thus introduce the set $\mathcal M_1(C([0,1], \mathcal M_1(S^\infty)) \otimes \Omega)$
of probability measures on this space tensored with $\Omega$. Then we define the elements

$$K_n \in \mathcal M_1\left(C([0,1], \mathcal M_1(S^\infty)) \otimes \Omega\right)$$

whose marginals on $\Omega$ are $\mathbb P$ and whose conditional measures on $C([0,1], \mathcal M_1(S^\infty))$, given $\mathcal F$, are
the Dirac measures on the measure valued function $\mu_{\Lambda_{[tn]}}[\omega]$, $t \in [0,1]$. Convergence, and even the



7 The outcome of our construction will depend on the choice of this sequence. Our philosophy here would be to
choose a natural sequence of volumes for the problem at hand. In mean field examples this would be $\Lambda_n = \{1,\dots,n\}$,
on a lattice one might choose cubes of sidelength $n$.

8 Another possibility would be a measure valued version of the space $D([0,1], \mathcal M_1(S))$ of measure valued càdlàg
functions. The choice depends essentially on the properties we expect from the limiting process (i.e. continuous
sample paths or not).
existence of limit points for this sequence of measures is now no longer a trivial matter. The problem
of the existence of limit points can be circumvented by using a weaker notion of convergence, e.g.
that of the convergence of any finite dimensional marginal. Otherwise, some tightness condition is
needed [HH], e.g. we must check that for any continuous function $f$, $\sup_{|s-t| < \delta} |\mu_{\Lambda_n}^s(f) - \mu_{\Lambda_n}^t(f)|$
converges to zero in probability, uniformly in $n$, as $\delta \downarrow 0$.^9

We can always hope that the limit as $n$ goes to infinity of $K_n$ exists. If the limit, $K$, exists, we can
again consider its conditional distribution given $\mathcal F$, and the resulting object is the functional analog
of the Aizenman-Wehr metastate. (We feel tempted to call this object the "superstate". Note that
the marginal distribution of the superstate "at time $t = 1$" is the Aizenman-Wehr metastate, and
the law of the empirical distribution of the underlying process is the empirical metastate.) The
"superstate" contains an enormous amount of information on the asymptotic volume dependence
of the random Gibbs measures; on the other hand, its construction in any explicit form is generally
hardly feasible.

Finally, we want to stress that the superstate will normally depend on the choice of the basic
sequences $\Lambda_n$ used in its construction. This feature is already present in the empirical metastate.
In particular, sequences growing extremely fast will give different results than slowly increasing
sequences. On the other hand, the very precise choice of the sequences should not be important.
A natural choice would appear to us to be sequences of cubes of sidelength $n$, or, in mean field models,
simply the sequence of volumes of size $n$.

Boundary conditions, external fields, conditioning. In the discussion of Newman and Stein, 
metastates are usually constructed with simple boundary conditions such as periodic or "free" ones. 
They emphasize the feature of the "selection of the states" by the disorder in a given volume without 
any bias through boundary conditions or symmetry breaking fields. Our point of view is somewhat 
different in this respect in that we think that the idea to apply special boundary conditions or, in
mean field models, symmetry breaking terms, to improve convergence properties, is still to some
extent useful, the aim ideally being to achieve the situation (1). Our only restriction in this is
really that our procedure shall have some predictive power, that is, it should give information on
the approximate form of a finite volume Gibbs state. This excludes any construction involving
subsequences via compactness arguments. We thus are interested to know to what extent it is
possible to reduce the "choice" of available states for the randomness to select from, to smaller 
subsets and to classify the minimal possible subsets (which then somehow play the role of extremal 
states). In fact, in the examples considered in [K,BGK] it would be possible to reduce the size of 

9 There are pathological examples in which we would not expect such a result to be true. An example is the
"highly disordered spin glass model" of Newman and Stein [NS5]. Of course, tightness may also be destroyed by
choosing very rapidly growing sequences of volumes $\Lambda_n$.
such subsets to one, while in the example of the present paper, we shall see that this is impossible.
We have to discuss this point carefully. 

While in short range lattice models the DLR construction gives a clear framework how the 
class of infinite volume Gibbs measures is to be defined, in mean field models this situation is 
somewhat ambiguous and needs discussion. 

If the infinite volume Gibbs measure is unique (for given $\omega$), quasi by definition, (1) must hold.
So our problems arise from non-uniqueness. Hence the following recipe: modify $\mu_\Lambda$ in such a way
that uniqueness holds, while otherwise perturbing it in a minimal way. Two procedures suggest
themselves: 

(i) Tilting, and 

(ii) Conditioning 

Tilting consists in the addition of a symmetry breaking term to the Hamiltonian whose strength
is taken to zero. Mostly, this term is taken linear so that it has the natural interpretation of a
magnetic field. More precisely, define

$$\ldots \tag{2.22}$$

Here $h_i$ is some sequence of numbers that in general will have to be allowed to depend on $\omega$ if
anything is to be gained. One may also allow them to depend on $\Lambda$ explicitly, if so desired. From
a physical point of view we might wish to add further conditions, like some locality of the
$\omega$-dependence; in principle there should be a way of writing them down in some explicit way. We
should stress that tilting by linear functions is not always satisfactory, as some states that one
might wish to obtain are lost; an example is the generalized Curie-Weiss model with Hamiltonian
$H_N(\sigma) = -\frac{N}{4} [m_N(\sigma)]^4$ at the critical point. There, the free energy has three degenerate absolute
minima at $-m^*$, $0$, and $+m^*$, and while we might want to think of three coexisting phases, only the
measures centered at $\pm m^*$ can be extracted by the above method. Of course this can be remedied
by allowing arbitrary perturbations $h(m)$ with the only condition that $\|h\|_\infty$ tends to zero at the
end.

By conditioning we mean always conditioning the macroscopic variables to be in some set
$\mathcal A$. This appears natural since, in lattice models, extremal measures can always be extracted
from arbitrary DLR measures by conditioning on events in the tail sigma fields; the macroscopic
variables are measurable with respect to the tail sigma fields. Of course only conditioning on
events that do not have too small probability will be reasonable. Without going into too much of
a motivating discussion, we will adopt the following conventions. Let $\mathcal A$ be an event in the sigma
algebra generated by the macroscopic function. Put



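$$f_{\Lambda,\beta}[\omega](\mathcal A) \equiv -\frac{1}{\beta |\Lambda|} \ln \mu_{\Lambda,\beta}[\omega](\mathcal A) \tag{2.23}$$

We call $\mathcal A$ admissible for conditioning if and only if

$$\lim_{|\Lambda|\uparrow\infty} f_{\Lambda,\beta}[\omega](\mathcal A) = 0 \tag{2.24}$$

We call $\mathcal A$ minimal if it cannot be decomposed into two admissible subsets. In analogy with (2.22)
we then define

$$\mu_{\Lambda,\beta}^{\mathcal A}[\omega](\cdot) \equiv \mu_{\Lambda,\beta}[\omega](\cdot | \mathcal A) \tag{2.25}$$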

We define the set of all limiting Gibbs measures to be the set of limit points of measures $\mu_{\Lambda,\beta}^{\mathcal A}$ with
admissible sets $\mathcal A$. Choosing $\mathcal A$ minimal, we improve our chances of obtaining convergent sequences
and the resulting limits are serious candidates for extremal limiting Gibbs measures, but we stress
that this is not guaranteed to succeed, as will become manifest in our examples. This does not mean
that adding such conditioning is not going to be useful. It is in fact useful, as it will reduce the disorder
in the metastate and may in general allow one to construct various different metastates in the case of
phase transitions. The point to be understood here is that within the general framework outlined
above, we should consider two different notions of uniqueness:

(a) Strong uniqueness meaning that for almost all $\omega$ there is only one limit point $\mu_\infty[\omega]$, and

(b) Weak uniqueness^10 meaning that there is a unique metastate, in the sense that for any choice
of $\mathcal A$, the metastate constructed taking the infinite volume limit with the measures $\mu_{\Lambda,\beta}^{\mathcal A}$ is the
same.

In fact, it may happen that the addition of a symmetry breaking term or conditioning does 
not lead to strong uniqueness. Rather, what may be true is that such a field selects a subset of the 
states, but to which of them the state at given volume resembles can depend on the volume in a 
complicated way. 

If weak uniqueness does not hold, one has a non-trivial set of metastates. 

It is quite clear that a sufficiently general tilting approach is equivalent to the conditioning 
approach; we prefer for technical reasons to use the conditioning in the present paper. We also 
note that by dropping condition (2.24) one can enlarge the class of limiting measures obtainable 
to include metastable states, which in many applications, in particular in the context of dynamics, 
are also relevant. 



10 Maybe the notion of meta-uniqueness would be more appropriate.

3. Properties of the induced measures.



In this section we collect a number of results on the distribution of the overlap parameters in 
the Hopfield model that were obtained in some of our previous papers [BG1,BG2,BG3]. We cite 
these results mostly from [BG3] where they were stated in the most suitable form for our present 
purposes and we refer the reader to that paper for the proofs. 

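We recall some notation. Let $m^*(\beta)$ be the largest solution of the mean field equation $m = \tanh(\beta m)$. Note that $m^*(\beta)$ is strictly positive for all $\beta > 1$, $\lim_{\beta\uparrow\infty} m^*(\beta) = 1$, $\lim_{\beta\downarrow 1} \frac{(m^*(\beta))^2}{3(\beta-1)} = 1$ and $m^*(\beta) = 0$ if $\beta \le 1$. Denoting by $e^\mu$ the $\mu$-th unit vector of the canonical basis of $\mathbb{R}^M$ we set, for all $(\mu,s) \in \{1,\dots,M(N)\} \times \{-1,1\}$,

$$m^{(\mu,s)} \equiv s\, m^*(\beta)\, e^\mu, \qquad (3.1)$$

and for any $\rho > 0$ we define the balls

$$B_\rho^{(\mu,s)} \equiv \{x \in \mathbb{R}^M : \|x - m^{(\mu,s)}\|_2 \le \rho\}. \qquad (3.2)$$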
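As a numerical illustration of the quantity $m^*(\beta)$ recalled above, the following minimal sketch computes the largest solution of $m = \tanh(\beta m)$ by bisection. The function name `m_star`, the tolerance and the search strategy are our own illustrative choices and are not part of the analysis in [BG3].

```python
import math

def m_star(beta, tol=1e-12):
    """Largest solution of m = tanh(beta * m); equals 0 for beta <= 1."""
    if beta <= 1.0:
        return 0.0
    g = lambda m: math.tanh(beta * m) - m
    hi = 1.0                 # g(1) < 0 because tanh < 1
    lo = 0.5
    # g is concave on m > 0 with g(0) = 0 and g'(0) = beta - 1 > 0,
    # so g > 0 exactly on (0, m*). Shrink lo until it lies left of the root.
    for _ in range(60):
        if g(lo) > 0.0:
            break
        lo *= 0.5
    else:
        return 0.0           # root is below the resolution of this search
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Behaviour near the critical point: (m*)^2 / (3 (beta - 1)) should be close to 1.
beta_near_one = 1.001
ratio = m_star(beta_near_one) ** 2 / (3.0 * (beta_near_one - 1.0))
```

For $\beta$ slightly above 1 the variable `ratio` gives a quick check of the stated limit $\lim_{\beta\downarrow 1} (m^*(\beta))^2/(3(\beta-1)) = 1$.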
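For any pair of indices $(\mu,s)$ and any $\rho > 0$ we define the conditional measures

$$\mu^{(\mu,s)}_{N,\beta,\rho}[\omega](\mathcal{A}) \equiv \mu_{N,\beta}[\omega](\mathcal{A} \mid B_\rho^{(\mu,s)}), \quad \mathcal{A} \in \mathcal{B}(\{-1,1\}^N), \qquad (3.3)$$

and the corresponding induced measures

$$\mathcal{Q}^{(\mu,s)}_{N,\beta,\rho}[\omega](A) \equiv \mathcal{Q}_{N,\beta}[\omega](A \mid B_\rho^{(\mu,s)}), \quad A \in \mathcal{B}(\mathbb{R}^{M(N)}). \qquad (3.4)$$

The point here is that for $\rho > c\frac{\sqrt{\alpha}}{m^*}$, the sets $B_\rho^{(\mu,s)}$ are admissible in the sense of the last section.

It will be extremely useful to introduce the Hubbard-Stratonovich transformed measures $\tilde{\mathcal{Q}}_{N,\beta}[\omega]$ which are nothing but the convolutions of the induced measures with a gaussian measure of mean zero and variance $1/\beta N$, i.e.

$$\tilde{\mathcal{Q}}_{N,\beta}[\omega] \equiv \mathcal{Q}_{N,\beta}[\omega] \star \mathcal{N}\Big(0, \frac{\mathbb{1}}{\beta N}\Big). \qquad (3.5)$$

Similarly we define the conditional Hubbard-Stratonovich transformed measures

$$\tilde{\mathcal{Q}}^{(\mu,s)}_{N,\beta,\rho}[\omega](A) \equiv \tilde{\mathcal{Q}}_{N,\beta}[\omega](A \mid B_\rho^{(\mu,s)}), \quad A \in \mathcal{B}(\mathbb{R}^{M(N)}). \qquad (3.6)$$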
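We will need to consider the Laplace transforms of these measures which we will denote by 10

$$\mathcal{L}^{(\mu,s)}_{N,\beta,\rho}[\omega](t) \equiv \int e^{(t,x)}\, d\mathcal{Q}^{(\mu,s)}_{N,\beta,\rho}[\omega](x), \quad t \in \mathbb{R}^{M(N)}, \qquad (3.7)$$

and

$$\tilde{\mathcal{L}}^{(\mu,s)}_{N,\beta,\rho}[\omega](t) \equiv \int e^{(t,x)}\, d\tilde{\mathcal{Q}}^{(\mu,s)}_{N,\beta,\rho}[\omega](x), \quad t \in \mathbb{R}^{M(N)}. \qquad (3.8)$$

10 This notation is slightly different from the one used in [BG3].
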
The following is a simple adaptation of Proposition 2.1 of [BG3] to these notations. 

Proposition 3.1: Assume that (3 > 1. There exist finite positive constants c = c((3), c = c((3),c = 
c((3) such that, with probability one, for all but a finite number of indices N , if p satisfies 

l -m* >p>c {0){ 1 ± r ,f\^} (3.9) 

then, for all t with ^= < oo, 

tfAm) (1 - z-~ cM ) < e-^ ml %#M(t) < z- ZM + Cffijt) (1 + e-* M ) (3.10) 

ii) for any p,p satisfying (3.9) 

3ft>](t) (1 - e"^) < C%%] p [u]{t) < e~™ + C%^] p M(t) (1 + e"^) (3.11) 

Hi) for any p,p satisfying (3.9) 

< ||t|| 2 e- 5M (3.12) 



/' 



l(M> s ) 



-N,j3,pV 



A closely related result that we will need is also an adaptation of estimates from [BG3], i.e. it 
is obtained combining Lemmata 3.2 and 3.4 of that paper. 

Lemma 3.2: There exists $\gamma_a > 0$, such that for all $\beta > 1$ and $\sqrt{\alpha} < \gamma_a (m^*)^2$, if $c\frac{\sqrt{\alpha}}{m^*} <
\rho < m^*/\sqrt{2}$ then, with probability one, for all but a finite number of indices N, for all $\mu \in
\{1,\dots,M(N)\}$, $s \in \{-1,1\}$, for all $b > 0$ such that $\rho + b < \sqrt{2}\, m^*$,

l< Q "llg <l + e-^ (3-13) 

where $0 < c_2 < \infty$ is a numerical constant.

We finally recall our result on local convexity of the function $\Phi$.
Theorem 3.3: Assume that $1 < \beta < \infty$. If the parameters $\alpha, \beta, \rho$ are such that for $\epsilon > 0$,

$$\inf_{\tau}\Big(\beta\big(1 - \tanh^2(\beta m^*(1-\tau))\big)(1 + 3\sqrt{\alpha}) + 2\beta \tanh^2(\beta m^*(1-\tau))\, \Gamma(\alpha, \tau m^*/\rho)\Big) \le 1 - \epsilon, \qquad (3.14)$$

then with probability one for all but a finite number of indices N, $\Phi_{N,\beta}[\omega](m^* e^1 + v)$ is a twice
differentiable and strictly convex function of v on the set $\{v : \|v\|_2 \le \rho\}$, and

$$\lambda_{\min}\big(\nabla^2 \Phi_{N,\beta}[\omega](m^* e^1 + v)\big) > \epsilon \qquad (3.15)$$

on this set.

Remark: This theorem was first obtained in [BG1]; the above form is cited and proven in [BG2].
With $\rho$ chosen as $\rho = c\frac{\sqrt{\alpha}}{m^*}$, the condition (3.14) means (i) for $\beta$ close to 1: $\frac{\sqrt{\alpha}}{(m^*)^2}$ small and, (ii)
for $\beta$ large: $\alpha \le c\beta^{-1}$. The condition on $\alpha$ for large $\beta$ seems unsatisfactory, but one may easily
convince oneself that it cannot be substantially improved.

4. Brascamp-Lieb inequalities.

A basic tool of our analysis is provided by the so-called Brascamp-Lieb inequalities [BL]. In fact, we need
such inequalities in a slightly different setting than they are presented in the literature, namely for
measures with bounded support on some domain $D \subset \mathbb{R}^M$. Our derivation follows the one given
in [H] (see also [HS]), and is in this context almost obvious.

Let $D \subset \mathbb{R}^M$ be a bounded connected domain. Let $V \in C^2(D)$ be a twice continuously
differentiable function on D, let $\nabla^2 V$ denote its Hessian matrix and assume that, for all $x \in D$,
$\nabla^2 V(x) \ge c > 0$ (where we say that a matrix $A \ge c$, if and only if for all $v \in \mathbb{R}^M$, $(v, Av) \ge c(v, v)$).
We define the probability measure $\nu$ on $(D, \mathcal{B}(D))$ by

$$\nu(dx) \equiv \frac{e^{-NV(x)}\, d^M x}{\int_D e^{-NV(x)}\, d^M x}. \qquad (4.1)$$

Our central result is 

Theorem 4.1: Let $\nu$ be the probability measure defined above. Assume that $f, g \in C^1(D)$, and
assume that (w.l.o.g.) $\int_D d\nu(x) g(x) = \int_D d\nu(x) f(x) = 0$. Then

$$\left|\int_D d\nu(x) f(x) g(x)\right| \le \frac{1}{cN} \int_D d\nu(x) \|\nabla f(x)\|_2 \|\nabla g(x)\|_2 + \frac{1}{cN} \frac{\int_{\partial D} |g(x)| \|\nabla f(x)\|_2\, e^{-NV(x)}\, d^{M-1}x}{\int_D e^{-NV(x)}\, d^M x} \qquad (4.2)$$

where $d^{M-1}x$ is the Lebesgue measure on $\partial D$.
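To make the structure of the bound (4.2) concrete, here is a small numerical sanity check in dimension $M = 1$, where the surface measure on $\partial D$ reduces to the counting measure on the two endpoints of an interval. The choices of V, f, N and the interval are purely illustrative assumptions; the sketch only verifies the inequality in one example and is not part of the proof.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        total += func(a + k * h)
    return total * h

N = 10.0                                   # the large parameter N in (4.1)
a, b = -1.0, 1.0                           # D = (a, b): bounded and connected
V = lambda x: 0.5 * x**2 + 0.25 * x**4     # V''(x) = 1 + 3 x^2 >= c = 1
c = 1.0
f = lambda x: x                            # mean zero under nu by symmetry; we take g = f
df = lambda x: 1.0                         # derivative of f
w = lambda x: math.exp(-N * V(x))          # unnormalised density of nu

Z = integrate(w, a, b)
# Left-hand side of (4.2) with g = f.
lhs = integrate(lambda x: f(x) * f(x) * w(x), a, b) / Z
# Bulk term (1/(cN)) * int dnu |f'|^2.
bulk = integrate(lambda x: df(x) ** 2 * w(x), a, b) / Z / (c * N)
# Boundary term: the "surface integral" is a sum over the two endpoints.
boundary = (abs(f(a)) * abs(df(a)) * w(a) + abs(f(b)) * abs(df(b)) * w(b)) / (c * N * Z)
rhs = bulk + boundary
```

In this example the boundary contribution is already tiny compared with the bulk term, in line with the later remark that the correction due to the finite domain is irrelevant when the measure is concentrated well inside D.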

Proof: We consider the Hilbert space $L^2(D, \mathbb{R}^M, \nu)$ of $\mathbb{R}^M$ valued functions on D with scalar
product $\langle F, G \rangle \equiv \int_D d\nu(x) (F(x), G(x))$. Let $\nabla$ be the gradient operator on D defined with a
domain of all bounded $C^1$-functions that vanish on $\partial D$. Let $\nabla^*$ denote its adjoint. Note that
$\nabla^* = -e^{NV(x)} \nabla e^{-NV(x)} = -\nabla + N(\nabla V(x))$. One easily verifies by partial integration that on this
domain the operator $\nabla\nabla^* \equiv -\nabla e^{NV(x)} \nabla e^{-NV(x)} = \nabla^*\nabla + N\nabla^2 V(x)$ is symmetric and $\nabla^*\nabla \ge 0$,
so that by our hypothesis, $\nabla\nabla^* \ge cN > 0$. As a consequence, $\nabla\nabla^*$ has a self-adjoint extension
whose inverse $(\nabla\nabla^*)^{-1}$ exists on all $L^2(D, \mathbb{R}^M, \nu)$ and is bounded in norm by $(cN)^{-1}$.

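As a consequence of the above, for any $f \in C^1(D)$, we can uniquely solve the differential
equation

$$\nabla\nabla^* \nabla u = \nabla f \qquad (4.3)$$

for $\nabla u$. Now note that (4.3) implies that $\nabla^*\nabla u = f + k$, where k is a constant 11. Hence for real
valued f and g as in the statement of the theorem,

$$\begin{aligned}
\int_D d\nu(x) (\nabla g(x), \nabla u(x)) &= \int_D d\nu(x)\, e^{NV(x)} \operatorname{div}\big(e^{-NV(x)} g(x) \nabla u(x)\big) + \int_D d\nu(x) g(x) \nabla^*\nabla u(x) \\
&= \frac{1}{Z} \int_D d^M x \operatorname{div}\big(e^{-NV(x)} g(x) \nabla u(x)\big) + \int_D d\nu(x) g(x) f(x)
\end{aligned} \qquad (4.4)$$

where $Z \equiv \int_D d^M x\, e^{-NV(x)}$. Therefore, taking into account that $\nabla u = (\nabla\nabla^*)^{-1} \nabla f$,

$$\begin{aligned}
\left|\int_D d\nu(x) g(x) f(x)\right| &\le \left|\int_D d\nu(x) \big(\nabla g(x), (\nabla\nabla^*)^{-1} \nabla f(x)\big)\right| + \frac{1}{Z} \left|\int_D d^M x \operatorname{div}\big(e^{-NV(x)} g(x) \nabla u(x)\big)\right| \\
&\le \frac{1}{cN} \int_D d\nu(x) \|\nabla g(x)\|_2 \|\nabla f(x)\|_2 + \frac{1}{cN} \frac{1}{Z} \int_{\partial D} |g(x)| \|\nabla f(x)\|_2\, e^{-NV(x)}\, d^{M-1}x
\end{aligned} \qquad (4.5)$$

Note that in the second term we used the Gauss-Green formula to convert the integral over a divergence
into a surface integral. This concludes the proof.

11 Observe that this is only true because D is connected. For D consisting of several connected components the
theorem is obviously false.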

Remark: As is obvious from the proof above and as was pointed out in [H], one can replace
the bound on the lowest eigenvalue of the Hessian of V by a bound on the lowest eigenvalue of
the operator $\nabla\nabla^*$. So far we have not seen how to get a better bound on this eigenvalue in our
situation, but it may well be that this observation can be a clue to an improvement of our results.

The typical situation where we want to use Theorem 4.1 is the following: Suppose we are
given a measure like (4.1), not on D but on some bigger domain. We may be able to establish
the lower bound on $\nabla^2 V$ not everywhere, but only on the smaller domain D, in such a way that the
measure is essentially concentrated on D anyhow. It is then likely that we can also estimate away
the boundary term in (4.2), either because V(x) will be large on $\partial D$, or because $\partial D$ will be very
small (or both). We then have essentially the Brascamp-Lieb inequalities at our disposal.

We mention the following corollary which shows that the Brascamp-Lieb inequalities give rise 
to concentration inequalities under certain conditions. 

Corollary 4.2: Let $\nu$ be as in Theorem 4.1. Assume that $f \in C^1(D)$ and that moreover
$V_t(x) \equiv V(x) - t f(x)/N$ for $t \in [0,1]$ is still strictly convex and $\lambda_{\min}(\nabla^2 V_t) \ge c' > 0$. Then

[ dv(x)efM - J dv(x)f{x) < ^ 

i . I M t — ■ ,-• , \ i I V \ i ,.\ i :\ I , 



<ln 

+ sup 

te[o,i; 



sup / dv t {x)\\Vf\\l 



1 J dD \g( X )\\\Vf(x)\\ 2 e- NV ^ )d 
c'N f D e- NV *Wd M x 



te[o,i] Jd 

M-l, 



(4.6) 



where $\nu_t$ is the corresponding measure with V replaced by $V_t$.

Proof: Note that

$$\begin{aligned}
\ln \mathbb{E}_\nu e^{f} &= \mathbb{E}_\nu f + \int_0^1 ds \int_0^s ds'\, \frac{\mathbb{E}_\nu\, e^{s'f} \Big(f - \frac{\mathbb{E}_\nu e^{s'f} f}{\mathbb{E}_\nu e^{s'f}}\Big)^2}{\mathbb{E}_\nu e^{s'f}} \\
&= \mathbb{E}_\nu f + \int_0^1 ds \int_0^s ds'\, \mathbb{E}_{\nu_{s'}} \big(f - \mathbb{E}_{\nu_{s'}} f\big)^2
\end{aligned} \qquad (4.7)$$

where by assumption $V_s(x)$ has the same properties as V itself. Thus using (4.2) gives (4.6). ◇

Remark: We would like to note that a concentration estimate like Corollary 4.2 can also be derived
under slightly different hypotheses on f using logarithmic Sobolev inequalities (see [Le]) which hold
under the same hypotheses as Theorem 4.1, and which in fact can be derived as a special case using
$f = h^2$ and $g = \ln h^2$ in Theorem 4.1.

In the situations where we will apply the Brascamp-Lieb inequalities, the correction terms due 
to the finite domain D will be totally irrelevant. This follows from the following simple observation. 

Lemma 4.3: Let $B_\rho$ denote the ball of radius $\rho$ centered at the origin. Assume that for all
$x \in D$, $d \ge \nabla^2 V(x) \ge c > 0$. If $x^*$ denotes the unique minimum of V, assume that $\|x^*\|_2 \le \rho/2$.
Then there exists a constant $K < \infty$ (depending only on c and d) such that if $\rho \ge K\sqrt{M/N}$, then
for N large enough

$$\frac{\int_{\partial D} e^{-NV(x)}\, d^{M-1}x}{\int_D e^{-NV(x)}\, d^M x} \le e^{-\rho^2 N/K}. \qquad (4.8)$$



The proof of this lemma is elementary and will be left to the reader.

5. The convergence of the Gibbs measures.

After these preliminaries we can now come to the central part of the paper, namely the study
of the marginal distributions of the Gibbs measures $\mu^{(\mu,s)}_{\Lambda,\beta,\rho}$. Without loss of generality it suffices to
consider the case $(\mu,s) = (1,1)$, of course. Let us fix $I \subset \mathbb{N}$ arbitrary but finite. We assume that
$\Lambda \supset I$, and for notational simplicity we put $|\Lambda| = N + |I|$. We are interested in the probabilities

(1,1) r wr n- m "^ e U2l {m A ( SI ,, AV )eB^} 

fA,0 , P M (Wi = s i}) = — ; ( 5 - x ) 

1 A \' {mA(s;,(TA\i)eBp } 

Note that ||m/(cr)||2 < \[M. Now we can write 



Then 



1 {m A (s I ,a A \ I )eB ( p 1 ' 1> } - I { mA \/(^)eB^ 1) } 



where p± = p ± — j^-- Setting (3' = jx\(3, this allows us to write 

(i,i) - - • - B " • 



|/|2 



and 



2l'liE ai / B ( lfl) dQx\i,p' (m)e^'l- r l( mj ( CTJ )' m ) e ^^ l|roi(<r/) 
J B (i_,D dQA\/,/3'(m) 

X 

LaA)dQ A \ Ii/3 ,(m) 

< r A/f , Ap+ M(^|J|m 7 («7))e^ l|mj(aj)l1 ' (gfc^) 
" 2l^liE (TI £ A// ^_M(/3'|/|m 7 (^))e /3 ^ ll ^^ )l1 ^ Qa\,,^ (4 M) ) 



(i.i) 



/:A//,^_M(/3 / |/|m J ( g /))e /3 ^ l|mj(sj)l1 - Caw 
2l^liE (TI £ A// ^ + M(/3'|/|m / (^))e^ ll ^^ ) ^ Qa\7,^ 
(5.3)



(5.4) 



(5.5) 



Now the term $\|m_I(s_I)\|_2^2$ is, up to a constant that is independent of the $s_i$, irrelevantly small.
More precisely, we have that

Lemma 5.1: There exist $\infty > C, c > 0$ such that for all I, M, and for all $x > 0$,



IP 



sup ^ 



mi(s)\\l 

< Cexp (-cM (VTTx~ - l) 2 ) 



Mill 

N 



> \I\M 
— N 



w + x 



(5.6) 



Proof: This Lemma is a direct consequence of estimates on the norm of the random matrices 
obtained, e.g. in Theorem 4.1 of [BG6].<C> 

Together with Proposition 3.1 and Lemma 3.2, we can now extract the desired representation 
for our probabilities. 

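Lemma 5.2: For all $\beta > 1$ and $\sqrt{\alpha} < \gamma_a (m^*)^2$, if $c\frac{\sqrt{\alpha}}{m^*} < \rho < m^*/\sqrt{2}$ then, with probability
one, for all but a finite number of indices N, for all $\mu \in \{1,\dots,M(N)\}$, $s \in \{-1,1\}$,

(i)

$$\mu^{(1,1)}_{\Lambda,\beta,\rho}[\omega](\{\sigma_I = s_I\}) = \frac{\mathcal{L}^{(1,1)}_{\Lambda\setminus I,\beta',\rho}[\omega](\beta'|I| m_I(s_I))}{2^{|I|}\, \mathbb{E}_{\sigma_I} \mathcal{L}^{(1,1)}_{\Lambda\setminus I,\beta',\rho}[\omega](\beta'|I| m_I(\sigma_I))} + O(N^{-1/4}) \qquad (5.7)$$

and alternatively

(ii)

$$\mu^{(1,1)}_{\Lambda,\beta,\rho}[\omega](\{\sigma_I = s_I\}) = \frac{\tilde{\mathcal{L}}^{(1,1)}_{\Lambda\setminus I,\beta',\rho}[\omega](\beta'|I| m_I(s_I))}{2^{|I|}\, \mathbb{E}_{\sigma_I} \tilde{\mathcal{L}}^{(1,1)}_{\Lambda\setminus I,\beta',\rho}[\omega](\beta'|I| m_I(\sigma_I))} + O(e^{-O(M)}) \qquad (5.8)$$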



We leave the details of the proof to the reader. We see that the computation of the marginal
distribution of the Gibbs measures requires nothing but the computation of the Laplace transforms
of the induced measures or their Hubbard-Stratonovich transforms at the random points $t = \sum_{i \in I} s_i \xi_i$.
Alternatively, these can be seen as the Laplace transforms of the distribution of the random variables $(\xi_i, m)$.

Now it is physically very natural that the law of the random variables $(\xi_i, m)$ should determine
the Gibbs measures completely. The point is that in a mean field model, the distribution of the
spins in a finite set I is determined entirely in terms of the effective mean fields produced by the rest
of the system that act on the spins $\sigma_i$. These fields are precisely the $(\xi_i, m)$. In a "normal" mean
field situation, the mean fields are constant almost surely with respect to the Gibbs measure. In the
Hopfield model with subextensively many patterns, this will also be true, as m will be concentrated
near one of the values $m^* e^\mu$ (see [BGP1]). In that case $(\xi_i, m)$ will depend only in a local and very
explicit form on the disorder, and the Gibbs measures will inherit this property. In a more general
situation, the local mean fields may have a more complicated distribution, in particular they may
not be constant under the Gibbs measure, and the question is how to determine this. The approach
of the cavity method (see e.g. [MPV]) as carried out by Talagrand [T1] consists in deriving this
distribution by induction over the volume. [PST] also followed this approach, using however the
assumption of "self-averaging" of the order parameter to control errors. Our approach consists in
using the detailed knowledge obtained on the measures $\mathcal{Q}$, and in particular the local convexity, to
determine a priori the form of the distribution; induction will then only be used to determine the
remaining few parameters.

Let us begin with some general preparatory steps which will not yet require special properties
of our measures. To simplify the notation, we introduce the following abbreviations:

We write $\mathbb{E}_{\Phi_N}$ for the expectation with respect to the measures $\tilde{\mathcal{Q}}_{\Lambda\setminus I,\beta'}[\omega]$ conditioned on
$B_\rho^{(1,1)}$ and we set $\bar{Z} \equiv Z - \mathbb{E}_{\Phi_N} Z$. We will write $\mathbb{E}_\xi$ for the expectation with respect to the family
of random variables $\xi_i^\mu$, $i \in I$, $\mu = 1,\dots,M$.

The first step in the computation of our Laplace transform consists in centering, i.e. we write

$$\mathbb{E}_{\Phi_N} e^{\beta' \sum_{i\in I} s_i (\xi_i, Z)} = e^{\beta' \sum_{i\in I} s_i (\xi_i, \mathbb{E}_{\Phi_N} Z)}\; \mathbb{E}_{\Phi_N} e^{\beta' \sum_{i\in I} s_i (\xi_i, \bar{Z})}. \qquad (5.9)$$

While the first factor will be entirely responsible for the distribution of the spins, our main
efforts have to go into controlling the second. To do this we will use heavily the fact, established
first in [BG1], that on $B_\rho^{(1,1)}$ the function $\Phi$ is convex with probability close to one. This allows
us to exploit the Brascamp-Lieb inequalities in the form given in Section 4. The advantage of this
procedure is that it allows us to identify immediately the leading terms and to get a priori estimates
on the errors. This is to be contrasted with the much more involved procedure of Talagrand [T1] who
controls the errors by induction.

General Assumption: For the remainder of this paper we will always assume that the parameters
$\alpha$ and $\beta$ of our model are such that the hypotheses of Proposition 3.1 and Theorem 3.3 are satisfied.
All lemmata, propositions and theorems are valid under this provision only.

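Lemma 5.3: Under our general assumption,
(i)

$$\mathbb{E}_\xi \mathbb{E}_{\Phi_N} e^{\beta' \sum_{i\in I} s_i (\xi_i, \bar{Z})} = e^{\frac{\beta'^2}{2} |I|\, \mathbb{E}_{\Phi_N} \|\bar{Z}\|_2^2} \times e^{O(1/(\epsilon N))} \qquad (5.10)$$

(ii) There is a finite constant C such that

$$\mathbb{E} \left[\ln \frac{\mathbb{E}_{\Phi_N} e^{\beta' \sum_{i\in I} s_i (\xi_i, \bar{Z})}}{\mathbb{E}_\xi \mathbb{E}_{\Phi_N} e^{\beta' \sum_{i\in I} s_i (\xi_i, \bar{Z})}}\right]^2 \le \frac{C}{N}. \qquad (5.11)$$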



Remark: The immediate consequence of this lemma is the observation that the family of random
variables $\{(\xi_i, \bar{Z})\}_{i\in I}$ is asymptotically close to a family of i.i.d. centered gaussian random variables
with variance $U_N \equiv \mathbb{E}_{\Phi_N} \|\bar{Z}\|_2^2$. $U_N$ will be seen to be one of the essential parameters that we will
need to control by induction. Note that for the moment, we cannot say whether the law of the
$(\xi_i, \bar{Z})$ converges in any sense, as it is not a priori clear whether $U_N$ will converge as $N \uparrow \infty$,
although this would be a natural guess. Note that as far as the computation of the marginal
probabilities of the Gibbs measures is concerned, this question is, however, completely irrelevant,
in as far as this term is an even function of the $s_i$.

Remark: It follows from Lemma 5.3 that

$$\ln \mathbb{E}_{\Phi_N} \exp\Big(\beta' \sum_{i\in I} s_i (\xi_i, \bar{Z})\Big) = \frac{\beta'^2}{2} |I|\, \mathbb{E}_{\Phi_N} \|\bar{Z}\|_2^2 + O\Big(\frac{1}{\epsilon N}\Big) + R_N \qquad (5.12)$$

where

$$\mathbb{E} R_N^2 \le \frac{C}{N}. \qquad (5.13)$$



Proof: The proof of this Lemma relies heavily on the use of the Brascamp-Lieb inequalities,
Theorem 4.1, which are applicable due to our assumptions and Theorem 3.3. It was given in [BG1]
for I being a single site, and we repeat the main steps. First note that



4 II y||4 

s.- \\Z\\ 4 



Note first that if the smallest eigenvalue of V 2 $ > e, then the Brascamp-Lieb inequalities Theorem 
4.1 yield 



and by iterated application 



E*AZ\\l<^ + 0{e-W) (5.15) 



E*AZ\\l<^+0{e-f N / K ) (5.16) 



In the bounds (5.14) we now use Corollary 4.2 with f given by $\frac{\beta'^2 |I|}{2} \|\bar{Z}\|_2^2$, respectively by
$\frac{\beta'^2 |I|}{2} \|\bar{Z}\|_2^2 - \frac{\beta'^4 |I|}{4} \|\bar{Z}\|_4^4$, to first move the expectation into the exponent, and then (5.15) and
(5.16) (applied to the slightly modified measures $\mathbb{E}_{\Phi_N - t f/N}$, which still retain the same convexity
properties) to the terms in the exponent. This gives (5.10).

By very similar computations one shows first that 

IE ^ N e^'^' 2) - IE IE^ N e^^' 2) ) < ^ (5.17) 

Moreover, using again Corollary 4.2, one obtains that (on the subspace £l where convexity holds) 

e-^l/H < JE^e^^^e-^W (5.18) 

These bounds, together with the obvious Lipschitz continuity of the logarithm away from zero, yield
(5.11).

Remark: The above proof follows ideas of the proof of Lemma 4.1 of [T1]. The main difference
is the systematic use of the Brascamp-Lieb inequalities, which allows us to avoid the appearance
of uncontrolled error terms.

We now turn to the mean values of the random variables $(\xi_i, \mathbb{E}_{\Phi_N} Z)$. These are obviously random
variables with mean value zero and variance $\|\mathbb{E}_{\Phi_N} Z\|_2^2$. Moreover, the variables $(\xi_i, \mathbb{E}_{\Phi_N} Z)$
and $(\xi_j, \mathbb{E}_{\Phi_N} Z)$ are uncorrelated for $i \ne j$. Now $\mathbb{E}_{\Phi_N} Z$ has one macroscopic component, namely
the first one, while all others are expected to be small. It is thus natural to expect that these
variables will actually converge to a sum of a Bernoulli variable $\xi_i^1 \mathbb{E}_{\Phi_N} Z_1$ plus independent gaussians
with variance $T_N \equiv \sum_{\mu=2}^M [\mathbb{E}_{\Phi_N} Z_\mu]^2$, but it is far from trivial to prove this. It requires in
particular at least to show that $T_N$ converges.

We will first prove the following proposition: 

Proposition 5.4: In addition to our general assumption, assume that $\liminf_{N\uparrow\infty} N^{1/4} T_N = +\infty$,
a.s. For $i \in I$, set $X_i(N) \equiv \frac{1}{\sqrt{T_N}} \sum_{\mu=2}^M \xi_i^\mu \mathbb{E}_{\Phi_N} Z_\mu$. Then this family converges to a family of i.i.d.
standard normal random variables.

Remark: The assumption on the divergence of $N^{1/4} T_N$ is harmless. We will see later that it is
certainly verified provided $\liminf_{N\uparrow\infty} N^{1/8}\, \mathbb{E} T_N = +\infty$. Recall that our final goal is to approximate
(in law) $\sum_{\mu=2}^M \xi_i^\mu \mathbb{E}_{\Phi_N} Z_\mu$ by $\sqrt{T_N}\, g_i$, where $g_i$ is gaussian. So if $T_N \le N^{-1/4}$, then $\sum_{\mu=2}^M \xi_i^\mu \mathbb{E}_{\Phi_N} Z_\mu$
is close to zero (in law) anyway, as is $\sqrt{T_N}\, g_i$, and no harm is done if we exchange the two. We
will see that this situation only arises in fact if M/N tends to zero rapidly, in which case all this
machinery is not needed.

Proof: To prove such a result requires essentially to show that $\mathbb{E}_{\Phi_N} Z_\mu$ for all $\mu \ge 2$ tend to zero
as $N \uparrow \infty$. We note first that by symmetry, for all $\mu \ge 2$, $\mathbb{E}\mathbb{E}_{\Phi_N} Z_\mu = \mathbb{E}\mathbb{E}_{\Phi_N} Z_2$. On the other
hand,

$$\sum_{\mu=2}^M [\mathbb{E}\mathbb{E}_{\Phi_N} Z_\mu]^2 \le \mathbb{E} \sum_{\mu=2}^M [\mathbb{E}_{\Phi_N} Z_\mu]^2 \le \rho^2 \qquad (5.19)$$

so that $|\mathbb{E}\mathbb{E}_{\Phi_N} Z_\mu| \le \rho M^{-1/2}$.



To derive from this a probabilistic bound on $\mathbb{E}_{\Phi_N} Z_\mu$ itself we will use concentration of measure
estimates. To do so we need the following lemma:

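Lemma 5.5: Assume that f(x) is a random function defined on some open neighborhood $U \subset \mathbb{R}$.
Assume that f verifies for all $x \in U$ that for all $0 \le r \le 1$,

$$\mathbb{P}\big[|f(x) - \mathbb{E} f(x)| > r\big] \le c \exp\Big(-\frac{N r^2}{c}\Big) \qquad (5.20)$$

and that, at least with probability $1 - p$, $|f'(x)| \le C$, $|f''(x)| \le C < \infty$ both hold uniformly in U.
Then, for any $0 < \zeta \le 1/2$, and for any $0 < \delta < N^{\zeta/2}$,

$$\mathbb{P}\big[|f'(x) - \mathbb{E} f'(x)| > \delta N^{-\zeta/2}\big] \le \frac{32 C^2}{\delta^2} N^{\zeta} \exp\Big(-\frac{\delta^4 N^{1-2\zeta}}{256 c}\Big) + p. \qquad (5.21)$$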



Proof: Let us assume that $|U| \le 1$. We may first assume that the boundedness conditions for
the derivatives of f hold uniformly; by standard arguments one shows that if they only hold with
probability $1 - p$, the effect is nothing more than the final summand p in (5.21). The first step
in the proof consists in showing that (5.20) together with the boundedness of the derivative of f
implies that $f(x) - \mathbb{E} f(x)$ is uniformly small. To see this introduce a grid of spacing $\epsilon$, i.e. let
$U_\epsilon = U \cap \epsilon\mathbb{Z}$. Clearly

$$\begin{aligned}
\mathbb{P}\Big[\sup_{x\in U} |f(x) - \mathbb{E} f(x)| > r\Big] &\le \mathbb{P}\Big[\sup_{x\in U_\epsilon} |f(x) - \mathbb{E} f(x)| + \sup_{x,y:\, |x-y|\le\epsilon} \big(|f(x) - f(y)| + |\mathbb{E} f(x) - \mathbb{E} f(y)|\big) > r\Big] \\
&\le \mathbb{P}\Big[\sup_{x\in U_\epsilon} |f(x) - \mathbb{E} f(x)| > r - 2C\epsilon\Big] \\
&\le \epsilon^{-1}\, \mathbb{P}\big[|f(x) - \mathbb{E} f(x)| > r - 2C\epsilon\big]
\end{aligned} \qquad (5.22)$$



If we choose $\epsilon = r/(4C)$, this yields



IP 



sup \ f(x) — IEf(x)\ > r 



AC 
< — exp 
r 



Nr 



(5.23)

Next we show that if $\sup_{x\in U} |f(x) - g(x)| \le r$ for two functions f, g with bounded second derivative,
then

$$|f'(x) - g'(x)| \le \sqrt{8Cr}. \qquad (5.24)$$

For notice that

$$\Big|\frac{1}{\epsilon}[f(x+\epsilon) - f(x)] - f'(x)\Big| \le \frac{\epsilon}{2} \sup_{x \le y \le x+\epsilon} |f''(y)| \le C\frac{\epsilon}{2} \qquad (5.25)$$

so that

$$|f'(x) - g'(x)| \le \frac{1}{\epsilon} |f(x+\epsilon) - g(x+\epsilon) - f(x) + g(x)| + C\epsilon \le \frac{2r}{\epsilon} + C\epsilon. \qquad (5.26)$$

Choosing the optimal $\epsilon = \sqrt{2r/C}$ gives (5.24). It suffices to combine (5.24) with (5.23) to get



IP 



\f'(x)-IEf'(x)\>VfrC 



AC 
< — exp 

r 



Nr 



(5.27) 



Setting $r = \frac{\delta^2}{8 C N^{\zeta}}$, we arrive at (5.21).
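The optimisation step behind (5.24) is elementary: the bound $2r/\epsilon + C\epsilon$ in (5.26) is minimised at $\epsilon = \sqrt{2r/C}$, where it equals $\sqrt{8Cr}$. The short sketch below checks this numerically for one illustrative pair (r, C); the specific values and the helper name `derivative_gap_bound` are assumptions made only for this example.

```python
import math

def derivative_gap_bound(r, C, eps):
    """Right-hand side of (5.26): 2 r / eps + C * eps."""
    return 2.0 * r / eps + C * eps

r, C = 1e-3, 2.0                       # illustrative values only
eps_opt = math.sqrt(2.0 * r / C)       # claimed minimiser
best = derivative_gap_bound(r, C, eps_opt)

# Scan a grid of spacings between 0.5 * eps_opt and 2 * eps_opt.
grid_min = min(
    derivative_gap_bound(r, C, eps_opt * (0.5 + 0.001 * k))
    for k in range(1, 1501)
)
```

Here `best` coincides with $\sqrt{8Cr}$ up to rounding, and no grid point does better, which is exactly the content of the passage from (5.26) to (5.24).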

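We will now use Lemma 5.5 to control $\mathbb{E}_{\Phi_N} Z_\mu$. We define

$$f(x) \equiv \frac{1}{\beta N} \ln \int_{B_\rho^{(1,1)}} d^M z\; e^{\beta N x z_\mu}\, e^{-\beta N \Phi_{N,\beta}[\omega](z)} \qquad (5.28)$$

and denote by $\mathbb{E}_{\Phi_N,x}$ the corresponding modified expectation. As has by now been shown many
times [T1,BG1], f(x) verifies (5.20). Moreover, $f'(x) = \mathbb{E}_{\Phi_N,x} Z_\mu$ and

$$f''(x) = \beta N\, \mathbb{E}_{\Phi_N,x} \big(Z_\mu - \mathbb{E}_{\Phi_N,x} Z_\mu\big)^2. \qquad (5.29)$$

Of course the addition of the linear term to $\Phi$ does not change its second derivative, so that we
can apply the Brascamp-Lieb inequalities also to the measure $\mathbb{E}_{\Phi_N,x}$. This shows that

$$\mathbb{E}_{\Phi_N,x} \big(Z_\mu - \mathbb{E}_{\Phi_N,x} Z_\mu\big)^2 \le \frac{1}{\epsilon N \beta} \qquad (5.30)$$

which means that f(x) has a second derivative bounded by $C = \frac{1}{\epsilon}$.
This gives the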

Corollary 5.6: There are finite positive constants c, C such that, for any $0 < \zeta \le \frac{1}{2}$, for any $\mu$,

$$\mathbb{P}\big[|\mathbb{E}_{\Phi_N} Z_\mu - \mathbb{E}\mathbb{E}_{\Phi_N} Z_\mu| \ge N^{-\zeta/2}\big] \le C N^{\zeta} \exp\Big(-\frac{N^{1-2\zeta}}{c}\Big). \qquad (5.31)$$

We are now ready to conclude the proof of our proposition. We may choose e.g. $\zeta = 1/4$
and denote by $\Omega_N$ the subset of $\Omega$ where, for all $\mu$, $|\mathbb{E}_{\Phi_N} Z_\mu - \mathbb{E}\mathbb{E}_{\Phi_N} Z_\mu| \le N^{-1/8}$. Then
$\mathbb{P}[\Omega_N^c] \le O(e^{-N^{1/2}})$.

We will prove the proposition by showing convergence of the characteristic function to that
of product standard normal distributions, i.e. we show that for any $t \in \mathbb{R}^I$, $\mathbb{E} \prod_{j\in I} e^{i t_j X_j(N)}$
converges to $\prod_{j\in I} e^{-\frac{1}{2} t_j^2}$. We have



jei 



+ 0(e~ Nl/2 ) 



(5.32) 



n n c ° s 

ix>2j£l 

Thus the second term tends to zero rapidly and can be forgotten. On the other hand, on $\Omega_N$,

$$\sum_{\mu=2}^M (\mathbb{E}_{\Phi_N} Z_\mu)^4 \le N^{-1/4} \sum_{\mu=2}^M (\mathbb{E}_{\Phi_N} Z_\mu)^2 \le N^{-1/4}\, T_N. \qquad (5.33)$$



Moreover, for any finite tj, for N large enough, 
x 2 /2\ < cx 4 for \x\ < 1, and that 



< e ^ 6 



sup 



n exp 



^=IE^ >N Z fM 



< 1. Thus, using that |lncosx 



jei 



<-N 



(5.34) 



Clearly, the right hand side converges to $e^{-\sum_{j\in I} t_j^2/2}$, provided only that $N^{1/4} T_N \uparrow \infty$. Since this
was assumed, the Proposition is proven.
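The elementary inequality $|\ln\cos x + x^2/2| \le c x^4$ for $|x| \le 1$ used in this argument can be checked numerically. The sketch below evaluates the ratio on a grid and compares it with the candidate constant $c = 1/8$; this particular value of c is our own illustrative choice, since the proof only needs some finite constant.

```python
import math

c = 1.0 / 8.0                                  # candidate constant (an assumption)
xs = [k / 1000.0 for k in range(10, 1001)]     # grid on [0.01, 1]; the bound is even in x

# Ratio |ln cos x + x^2/2| / x^4 on the grid.
ratios = [abs(math.log(math.cos(x)) + 0.5 * x * x) / x**4 for x in xs]
worst = max(ratios)
```

Since $-\ln\cos x - x^2/2 = x^4/12 + x^6/45 + \dots$ has only positive coefficients, the ratio increases with $|x|$, so the largest value on $[-1,1]$ is attained at $|x| = 1$.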

We now control the convergence of our Laplace transform except for the three parameters
$m_1(N) \equiv \mathbb{E}_{\Phi_N} Z_1$, $T_N \equiv \sum_{\mu=2}^M [\mathbb{E}_{\Phi_N} Z_\mu]^2$ and $U_N \equiv \mathbb{E}_{\Phi_N} \|\bar{Z}\|_2^2$. What we have to show is that
these quantities converge almost surely and that the limits satisfy the equations of the replica
symmetric solution of Amit, Gutfreund and Sompolinsky [AGS].

While the issue of convergence is crucial, the technical intricacies of its proof are largely
disconnected from the question of the convergence of the Gibbs measures. We will therefore assume
for the moment that these quantities do converge to some limits and draw the conclusions for the
Gibbs measures from the results of this section under this assumption (which will later be proven
to hold).

Indeed, collecting from Lemma 5.3 (see the remark following that lemma) and Proposition 5.4,
we can write

$$\mu^{(1,1)}_{\Lambda,\beta,\rho}[\omega](\{\sigma_I = s_I\}) = \frac{e^{\beta' \sum_{i\in I} s_i [m_1(N)\xi_i^1 + X_i(N)\sqrt{T_N}] + R_N(s_I)}}{2^{|I|}\, \mathbb{E}_{\sigma_I} e^{\beta' \sum_{i\in I} \sigma_i [m_1(N)\xi_i^1 + X_i(N)\sqrt{T_N}] + R_N(\sigma_I)}} \qquad (5.35)$$

where

$R_N(s_I) \to 0$ in probability,
$X_i(N) \to g_i$ in law,
$T_N \to \alpha r$ a.s.,
$m_1(N) \to m_1$ a.s.,

for some numbers r, $m_1$ and where $\{g_i\}_{i\in\mathbb{N}}$ is a family of i.i.d. standard gaussian random variables.

Putting this together we get that 

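Proposition 5.7: In addition to our general assumptions, assume that $T_N \to \alpha r$, a.s. and
$m_1(N) \to m_1$, a.s. Then, for any finite $I \subset \mathbb{N}$,

$$\mu^{(1,1)}_{\Lambda,\beta,\rho}(\{\sigma_I = s_I\}) \to \prod_{i\in I} \frac{e^{\beta s_i [m_1 \xi_i^1 + g_i \sqrt{\alpha r}]}}{2\cosh\big(\beta [m_1 \xi_i^1 + g_i \sqrt{\alpha r}]\big)} \qquad (5.36)$$

where the convergence holds in law with respect to the measure $\mathbb{P}$, $\{g_i\}_{i\in\mathbb{N}}$ is a family of
i.i.d. standard normal random variables and $\{\xi_i^1\}_{i\in\mathbb{N}}$ are independent Bernoulli random variables,
independent of the $g_i$ and having the same distribution as the variables $\xi_i^1$.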
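To illustrate the form of the limit in (5.36), the following sketch draws one realisation of the Bernoulli variables $\xi_i^1$ and the gaussians $g_i$ and evaluates the resulting random product measure on all spin configurations of a small set I. The numerical values of $\beta$, $m_1$ and $\alpha r$ are hypothetical placeholders (in the paper they are determined by the replica symmetric equations), and the function name `limiting_marginal` is ours.

```python
import itertools
import math
import random

def limiting_marginal(beta, m1, alpha_r, xi, g):
    """Probabilities of all s_I under the product law on the right of (5.36)."""
    # Effective local field acting on spin i: m1 * xi_i^1 + sqrt(alpha r) * g_i.
    fields = [m1 * x + math.sqrt(alpha_r) * gi for x, gi in zip(xi, g)]
    probs = {}
    for s in itertools.product((-1, 1), repeat=len(fields)):
        p = 1.0
        for si, h in zip(s, fields):
            p *= math.exp(beta * si * h) / (2.0 * math.cosh(beta * h))
        probs[s] = p
    return probs

rng = random.Random(0)                               # fixed seed for reproducibility
size_I = 3
xi = [rng.choice((-1, 1)) for _ in range(size_I)]    # Bernoulli components xi_i^1
g = [rng.gauss(0.0, 1.0) for _ in range(size_I)]     # i.i.d. standard normals g_i
# beta, m1 and alpha_r below are placeholder values, not solutions of the [AGS] equations.
probs = limiting_marginal(beta=2.0, m1=0.95, alpha_r=0.01, xi=xi, g=g)
```

Each factor $e^{\beta s_i h_i}/(2\cosh \beta h_i)$ is a probability on $\{-1,1\}$, so the values in `probs` sum to one; the randomness of the limiting measure enters only through the local fields $h_i$.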

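To arrive at the convergence in law of the random Gibbs measures, it is enough to show that
(5.36) holds jointly for any finite family of cylinder sets, $\{\sigma_i = s_i, \forall i \in I_k\}$, $I_k \subset \mathbb{N}$, $k = 1,\dots,\ell$ (cf.
[Ka], Theorem 4.2). But this is easily seen to hold from the same arguments. Therefore, denoting
by $\mu^{(1,1)}_{\infty,\beta}$ the random measure

$$\mu^{(1,1)}_{\infty,\beta}[\omega](\sigma) \equiv \prod_{i\in\mathbb{N}} \frac{e^{\beta \sigma_i [m_1 \xi_i^1[\omega] + \sqrt{\alpha r}\, g_i[\omega]]}}{2\cosh\big(\beta [m_1 \xi_i^1[\omega] + \sqrt{\alpha r}\, g_i[\omega]]\big)} \qquad (5.37)$$

we have

Theorem 5.8: Under the assumptions of Proposition 5.7, and with the same notation,

$$\mu^{(1,1)}_{\Lambda,\beta,\rho} \to \mu^{(1,1)}_{\infty,\beta}, \quad \text{in law, as } \Lambda \uparrow \infty. \qquad (5.38)$$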



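This result can easily be extended to the language of metastates. The following Theorem gives
an explicit representation of the Aizenman-Wehr metastate in our situation:

Theorem 5.9: Let $\kappa_\beta(\cdot)[\omega]$ denote the Aizenman-Wehr metastate. Under the hypothesis of
Proposition 5.7, for almost all $\omega$, for any continuous function $F : \mathbb{R}^k \to \mathbb{R}$, and cylinder functions
$f_i$ on $\{-1,1\}^{I_i}$, $i = 1,\dots,k$, one has

$$\int \kappa_\beta(d\mu)[\omega]\, F(\mu(f_1),\dots,\mu(f_k)) = \int \prod_{i\in I} d\mathcal{N}(g_i)\; F\Bigg(\sum_{s_{I_1}} f_1(s_{I_1}) \prod_{i\in I_1} \frac{e^{\beta s_i [\sqrt{\alpha r}\, g_i + m_1 \xi_i^1[\omega]]}}{2\cosh\big(\beta [\sqrt{\alpha r}\, g_i + m_1 \xi_i^1[\omega]]\big)}, \dots, \sum_{s_{I_k}} f_k(s_{I_k}) \prod_{i\in I_k} \frac{e^{\beta s_i [\sqrt{\alpha r}\, g_i + m_1 \xi_i^1[\omega]]}}{2\cosh\big(\beta [\sqrt{\alpha r}\, g_i + m_1 \xi_i^1[\omega]]\big)}\Bigg) \qquad (5.39)$$

where $I \equiv \cup_{i=1}^k I_i$ and $\mathcal{N}$ denotes the standard normal distribution.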

Remark: Modulo the convergence assumptions, that will be shown to hold in the next section,
Theorem 5.9 is the precise statement of Theorem 1.1. Note that the only difference from Theorem
5.8 is that the variables $\xi_i^1$ that appear here on the right hand side are now the same as those on
the left hand side.

Proof: This theorem is proven just as Theorem 5.8, except that the "almost sure version" of the
central limit theorem, Proposition 5.4, which in turn is proven just as Lemma 2.1, is used. The
details are left to the reader.

Remark: Our conditions on the parameters a and (3 place us in the regime where, according to 
[AGS] the "replica symmetry" is expected to hold. This is in nice agreement with the remark in 
[NS4] where replica symmetry is linked to the fact that the metastate is concentrated on product 
measures. 

Remark: One would be tempted to exploit also the other notions of "metastate" explained in 
Section 2. We see that the key to these constructions would be an invariance principle associated 
to the central limit theorem given in Proposition 5.4. However, there are a number of difficulties 
that so far have prevented us from proving such a result. We would have to study the random 
process 

M(tN) 

Xt(N)= (5-40) 

(1=2 

(suitably interpolated for t that are not integer multiples of 1/N). If this process was to converge to 
Brownian motion, its increments should converge to independent Gaussians with suitable variance. 
But 

M(tN) 

X' i (N)-Xf(N)= ti m ^ Z » 

^ M(SN) (5.41) 

M(sN) 
The first term on the right indeed has the desired properties, as is not too hard to check, but the
second term is hard to control.



To get some idea of the nature of this process, we recall from [BG1,BG2] that $\mathbb{E}_{\Phi_N} Z$ is
approximately given by $c(\beta) \frac{1}{N} \sum_{j\in\Lambda\setminus I} \xi_j$ (in the sense that the $\ell^2$ distance between the two vectors
is of order $\sqrt{\alpha}$ at most). Let us for simplicity consider only the case $I = \{0\}$. If we replace $\mathbb{E}_{\Phi_N} Z$
by this approximation, we are led to study the process

atN tN 

^W^E^EC (5-42) 

H=2 i=l 

for $tN$, $\alpha tN$ integer and linearly interpolated otherwise.

Proposition 5.10: The sequence of processes $Y_t(N)$ defined by (5.42) converges weakly to the
gaussian process $t^{-1} B_{\alpha t^2}$, where $B_s$ is a standard Brownian motion.

Proof: Notice that $\xi_0^\mu \xi_i^\mu$ has the same distribution as $\xi_i^\mu$, and therefore $Y_t(N)$ has the same
distribution as

atN tN 

?t W E wEE^ ( 5 - 43 ) 

fi=2 i = l 

for which the convergence to $B_{\alpha t^2}$ follows immediately from Donsker's theorem. ◇

At present we do not see how to extend this result to the real process of interest, but at least 
we can expect that some process of this type will emerge. 

As a final remark we investigate what would happen if we adopted the "standard" notion of 
limiting Gibbs measures as weak limit points along possibly random subsequences. The answer is 
the following 

Proposition 5.10: Under the assumptions of Proposition 5.7, for any finite $I \subset \mathbb{N}$, for any
$x \in \mathbb{R}^I$, for $\mathbb{P}$-almost all $\omega$, there exist sequences $N_k[\omega]$ tending to infinity such that for any
$s_I \in \{-1,1\}^I$

$$\lim_{k\uparrow\infty} \mu^{(1,1)}_{N_k[\omega],\beta,\rho}[\omega](\{\sigma_I = s_I\}) = \prod_{i\in I} \frac{e^{\beta s_i [m_1 \xi_i^1[\omega] + \sqrt{\alpha r}\, x_i]}}{2\cosh\big(\beta [m_1 \xi_i^1[\omega] + \sqrt{\alpha r}\, x_i]\big)} \qquad (5.44)$$

Proof: To simplify the notation we will write the proof only for the case $I = \{0\}$. The general case
differs only in notation. It is clear that we must show that for almost all $\omega$ there exist subsequences
$N_k[\omega]$ such that $X_0(N_k)[\omega]$ converges to x, for any chosen value x. Since by assumption $T_N$
converges almost surely to $\alpha r$, it is actually enough to show that the variables $Y_k \equiv \sqrt{T_{N_k}}\, X_0(N_k)$
converge to x. But this follows from the following lemma:

Lemma 5.11: Define $Y_k \equiv \sqrt{T_{N_k}}\, X_0(N_k)$. For any $x \in \mathbb{R}$ and any $\epsilon > 0$,

$$\mathbb{P}\big[Y_k \in (x - \epsilon, x + \epsilon) \text{ i.o.}\big] = 1 \qquad (5.45)$$



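Proof: Let us denote by $\mathcal{F}_0$ the sigma algebra generated by the random variables $\xi_i^\mu$, $\mu \in \mathbb{N}$, $i \geq 1$.
Note that

$$\mathbb{P}\left[Y_k \in (x-\epsilon, x+\epsilon)\ \text{i.o.}\right] = \mathbb{E}\left(\mathbb{P}\left[Y_k \in (x-\epsilon, x+\epsilon)\ \text{i.o.} \mid \mathcal{F}_0\right]\right) \tag{5.46}$$

so that it is enough to prove that for almost all $\omega$, $\mathbb{P}\left[Y_k \in (x-\epsilon, x+\epsilon)\ \text{i.o.} \mid \mathcal{F}_0\right] = 1$.
Let us define the random variables

$$\tilde Y_k \equiv \sum_{\mu=M(N_{k-1})+1}^{M(N_k)} \xi_0^\mu\, \mathbb{E}_{\tilde{\mathcal{Q}}_{N_k}} Z_\mu \tag{5.47}$$

Note first that

$$\mathbb{E}\left(Y_k - \tilde Y_k\right)^2 = \mathbb{E}\sum_{\mu=2}^{M(N_{k-1})} \left(\mathbb{E}_{\tilde{\mathcal{Q}}_{N_k}} Z_\mu\right)^2 \leq \rho^2\,\frac{N_{k-1}}{N_k} \tag{5.48}$$

Thus, if $N_k$ is chosen such that $\sum_{k=1}^{\infty} \frac{N_{k-1}}{N_k} < \infty$, by the first Borel-Cantelli lemma,

$$\lim_{k\uparrow\infty}\left(Y_k - \tilde Y_k\right) = 0 \quad \text{a.s.} \tag{5.49}$$

On the other hand, the random variables $\tilde Y_k$ are conditionally independent, given $\mathcal{F}_0$. Therefore,
by the second Borel-Cantelli lemma

$$\mathbb{P}\left[\tilde Y_k \in (x-\epsilon, x+\epsilon)\ \text{i.o.} \mid \mathcal{F}_0\right] = 1 \tag{5.50}$$

if

$$\sum_{k=1}^{\infty} \mathbb{P}\left[\tilde Y_k \in (x-\epsilon, x+\epsilon) \mid \mathcal{F}_0\right] = \infty \tag{5.51}$$

But for almost all $\omega$, $\tilde Y_k$ conditioned on $\mathcal{F}_0$ converges to a gaussian of variance $\alpha r$ (the proof is
identical to that of Proposition 5.3), so that for almost all $\omega$, as $k \uparrow \infty$

$$\mathbb{P}\left[\tilde Y_k \in (x-\epsilon, x+\epsilon) \mid \mathcal{F}_0\right] \to \frac{1}{\sqrt{2\pi\alpha r}} \int_{x-\epsilon}^{x+\epsilon} dy\, e^{-\frac{y^2}{2\alpha r}} > 0 \tag{5.52}$$

which implies (5.51) and hence (5.50). Putting this together with (5.49) concludes the proof of the
lemma, and of the proposition.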
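The key quantitative input of the proof is that the limiting Gaussian probability of any window $(x-\epsilon, x+\epsilon)$ is strictly positive, so that the series in the second Borel-Cantelli lemma diverges. A minimal sketch of this limiting probability, written with the error function (the function name and the sample parameter values are ours):

```python
import math

def gaussian_window_prob(x, eps, variance):
    """Probability that a centered Gaussian with the given variance lies in (x-eps, x+eps)."""
    s = math.sqrt(2.0 * variance)
    return 0.5 * (math.erf((x + eps) / s) - math.erf((x - eps) / s))

# Example: a window around x = 1 for variance 0.2 (hypothetical value of alpha*r).
p = gaussian_window_prob(1.0, 0.1, 0.2)
```

Since `p` is a fixed positive number, summing it over infinitely many conditionally independent trials gives a divergent series, which is all the argument needs.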
Some remarks concerning the implications of this proposition are in place. First, it shows
that if the standard definition of limiting Gibbs measures as weak limit points is adopted, then we
have discovered that in the Hopfield model all product measures on $\{-1,1\}^{\mathbb{N}}$ are extremal Gibbs
states. Such a statement contains some information, but it is clearly not useful as information on
the approximate nature of a finite volume state. This confirms our discussion in Section 2 on the
necessity to use a metastate formalism.

Second, one may ask whether conditioning or the application of external fields of vanishing 
strength as discussed in Section 2 can improve the convergence behaviour of our measures. The 
answer appears obviously to be no. Contrary to a situation where a symmetry is present whose 
breaking biases the system to choose one of the possible states, the application of an arbitrarily 
weak field cannot alter anything. 

Third, we note that the total set of limiting Gibbs measures does not depend on the conditioning
on the ball $B_\rho^{(1,1)}$, while the metastate obtained does depend on it. Thus the conditioning allows
us to construct two metastates corresponding to each of the stored patterns. These metastates are
in a sense extremal, since they are concentrated on the set of extremal (i.e. product) measures of
our system. Without conditioning one can construct other metastates (which however we cannot
control explicitly in our situation).

6. Induction and the replica symmetric solution

We now conclude our analysis by showing that the quantities $U_N \equiv \mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2$,
$m_1(N) \equiv \mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1$ and $T_N \equiv \sum_{\mu=2}^{M} \left(\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_\mu\right)^2$ actually do converge almost surely under our general
assumptions. The proof consists of two steps: First we show that these quantities are self-averaging and
then the convergence of their mean values is proven by induction. We will assume throughout this
section that the parameters $\alpha$ and $\beta$ are such that local convexity holds. We stress that this section
is entirely based on ideas of Talagrand [T1] and Pastur, Shcherbina and Tirozzi [PST] and is mainly
added for the convenience of the reader.

Thus our first result will be: 

Proposition 6.1: Let $A_N$ denote any of the three quantities $U_N$, $m_1(N)$ or $T_N$. Then there are
finite positive constants $c$, $C$ such that, for any $0 < \zeta < \frac{1}{2}$,

$$\mathbb{P}\left[\left|A_N - \mathbb{E}A_N\right| \geq N^{-\zeta/2}\right] \leq C N^{c} \exp\left(-\frac{N^{1-2\zeta}}{c}\right) \tag{6.1}$$

Proof: The proofs of these three statements are all very similar to that of Corollary 5.6. Indeed,
for $m_1(N)$, (6.1) is a special case of that corollary. In the two other cases, we just need to define
the appropriate analogues of the 'generating function' $f$ from (5.28). They are

g(x) = ^InlE^^y^ 2 '^ (6.2) 

in the case of $T_N$ and

~ 9 ( x ) = ^LlnIEz N ]E'zy N ^ (6.3) 

The proof then proceeds as in that of Corollary 5.6. We refrain from giving the details.

We now turn to the induction part of the proof and derive a recursion relation for the three
quantities above. In the sequel it will be convenient to introduce a site $0$ that will replace the set
$I$ and to set $\xi_0 = \eta$. Let us define

$$u_N(\tau) \equiv \ln \mathbb{E}_{\tilde{\mathcal{Q}}_N} e^{\beta\tau(\eta, Z)} \tag{6.4}$$

We also set $v_N(\tau) \equiv \tau\beta(\eta, \mathbb{E}_{\tilde{\mathcal{Q}}_N} Z)$ and $w_N(\tau) \equiv u_N(\tau) - v_N(\tau)$. In the sequel we will need the
following auxiliary result.

Lemma 6.2: Under our general assumptions

(i) $v_N(\tau)$, suitably centered and normalized, converges weakly to a standard gaussian random variable.

(ii) $\left|\frac{d}{d\tau}w_N(\tau) - \tau\beta^2\,\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\right|$ converges to zero in probability.



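Proof: (i) is obvious from Proposition 5.4 and the definition of $v_N(\tau)$. To prove (ii), note that
$w_N(\tau)$ is convex and $\frac{d^2}{d\tau^2}w_N(\tau)$ is bounded. Thus, if $\mathrm{var}\left(w_N(\tau)\right) \leq \frac{C}{\sqrt{N}}$, then

$$\mathrm{var}\left(\frac{d}{d\tau}w_N(\tau)\right) \leq \frac{C'}{N^{1/4}}$$

by a standard result similar in spirit to Lemma 5.5 (see e.g. [T2], Proposition 5.4). On the other
hand, $\left|\mathbb{E}w_N(\tau) - \frac{\tau^2\beta^2}{2}\,\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\right| \leq \frac{K}{\sqrt{N}}$, by Lemma 5.3, which, together with the boundedness
of the second derivative of $w_N(\tau)$, implies that $\left|\frac{d}{d\tau}\mathbb{E}w_N(\tau) - \tau\beta^2\,\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\right| \downarrow 0$. This means
that $\mathrm{var}\left(w_N(\tau)\right) \leq \frac{C}{\sqrt{N}}$ implies the lemma. Since we already know that $\mathbb{E}R_N^2 \leq \frac{K}{N}$,
it is enough to prove $\mathrm{var}\left(\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\right) \leq \frac{C}{\sqrt{N}}$. This follows just as the corresponding concentration
estimate for $U_N$. $\diamondsuit$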

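We are now ready to start the induction procedure. We will place ourselves on a subspace
$\tilde\Omega \subset \Omega$ where for all but finitely many $N$, $|U_N - \mathbb{E}U_N| \leq N^{-1/4}$, $|T_N - \mathbb{E}T_N| \leq N^{-1/4}$, etc. This
subspace has probability one by our estimates.

Let us note that by (iii) of Proposition 3.1, $\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_\mu$ and $\int d\mathcal{Q}^{(1,1)}_{N,\beta,\rho}(m)\, m_\mu$ differ only by an
exponentially small term. Thus

$$\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_\mu = \frac{1}{N}\sum_{i=1}^{N} \xi_i^\mu \int \mu^{(1,1)}_{N,\beta,\rho}(d\sigma)\,\sigma_i + O\left(e^{-cM}\right) \tag{6.5}$$

and, by symmetry,

$$\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_{N+1}}(Z_\mu) = \mathbb{E}\,\xi_0^\mu \int \mu^{(1,1)}_{N+1,\beta,\rho}(d\sigma)\,\sigma_0 + O\left(e^{-cM}\right) \tag{6.6}$$

Using Lemma 5.2 and the definition of $u_N$, this gives

$$\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_{N+1}}(Z_\mu) = \mathbb{E}\,\xi_0^\mu\, \frac{e^{u_N(1)} - e^{u_N(-1)}}{e^{u_N(1)} + e^{u_N(-1)}} + O\left(e^{-cM}\right) \tag{6.7}$$

where to be precise one should note that the left and right hand side are computed at temperatures
$\beta$ and $\beta'$ respectively, and that the value of $M$ is equal to $M(N+1)$ on both sides; that
is, both sides correspond to slightly different values of $\alpha$ and $\beta$, but we will see that this causes no
problems.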

Using our concentration results and Lemma 5.3 this gives

$$\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_{N+1}}(Z_\mu) = \mathbb{E}\,\xi_0^\mu \tanh\left(\beta\left(\xi_0^1\,\mathbb{E}m_1(N) + \sqrt{\mathbb{E}T_N}\,X_0(N)\right)\right) + O\left(N^{-1/4}\right) \tag{6.8}$$

Using further Proposition 5.4 we get a first recursion for $m_1(N)$:

$$m_1(N+1) = \int d\mathcal{N}(g)\, \tanh\left(\beta\left(\mathbb{E}m_1(N) + \sqrt{\mathbb{E}T_N}\,g\right)\right) + o(1) \tag{6.9}$$

Remark: The error term in (6.9) can be sharpened to $O(N^{-1/4})$ by using instead of Lemma
5.3 a trick, attributed to Trotter, that we learned from Talagrand's paper [T1] (see the proof of
Proposition 6.3 in that paper).

We need of course a recursion for $T_N$ as well. From here on there is no great difference from
the procedure in [PST], except that the $N$-dependences have to be kept track of carefully. This
was outlined in [BG4] and we repeat the steps for the convenience of the reader. To simplify the
notation, we ignore all the $O(N^{-1/4})$ error terms and put them back in the end only. Also, the
remarks concerning $\beta$ and $\alpha$ made above apply throughout.

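Note that $T_N = \|\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z\|_2^2 - \left(\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1\right)^2$ and

$$\begin{aligned}
\mathbb{E}\|\mathbb{E}_{\tilde{\mathcal{Q}}_{N+1}} Z\|_2^2
&= \mathbb{E}\sum_{\mu=1}^{M}\left(\frac{1}{N+1}\sum_{i=0}^{N} \xi_i^\mu\, \mu^{(1,1)}_{N+1,M}(\sigma_i)\right)^2 \\
&= \frac{M}{N+1}\,\mathbb{E}\left(\mu^{(1,1)}_{N+1,M}(\sigma_0)\right)^2
 + \mathbb{E}\sum_{\mu=1}^{M} \xi_0^\mu\, \mu^{(1,1)}_{N+1,M}(\sigma_0)\left(\frac{1}{N+1}\sum_{i=1}^{N} \xi_i^\mu\, \mu^{(1,1)}_{N+1,M}(\sigma_i)\right)
\end{aligned} \tag{6.10}$$

Using Lemma 5.2 as in the step leading to (6.7), we get for the first term in (6.10)

$$\mathbb{E}\left(\mu^{(1,1)}_{N+1,M}(\sigma_0)\right)^2 = \mathbb{E}\tanh^2\beta\left(\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right) \equiv \mathbb{E}Q_N \tag{6.11}$$

For the second term, we use the identity from [PST]

$$\sum_{\mu=1}^{M} \xi_0^\mu\left(\frac{1}{N+1}\sum_{i=1}^{N} \xi_i^\mu\, \mu^{(1,1)}_{N+1,M}(\sigma_i)\right) = \beta^{-1}\,\frac{\sum_{\tau=\pm 1} u_N'(\tau)\, e^{u_N(\tau)}}{\sum_{\tau=\pm 1} e^{u_N(\tau)}} \tag{6.12}$$

Together with Lemma 6.2 one concludes that in law up to small errors

$$\sum_{\mu=1}^{M} \xi_0^\mu\left(\frac{1}{N+1}\sum_{i=1}^{N} \xi_i^\mu\, \mu^{(1,1)}_{N+1,M}(\sigma_i)\right) = \xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N + \beta\,\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\, \tanh\beta\left(\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right) \tag{6.13}$$

and so

$$\begin{aligned}
\mathbb{E}\|\mathbb{E}_{\tilde{\mathcal{Q}}_{N+1}} Z\|_2^2 = \alpha\,\mathbb{E}Q_N
&+ \mathbb{E}\left[\tanh\beta\left(\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right)\left[\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right]\right] \\
&+ \beta\,\mathbb{E}\,\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\, \tanh^2\beta\left(\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right)
\end{aligned} \tag{6.14}$$

Using the self-averaging properties of $\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2$, the last term is of course essentially equal to

$$\beta\,\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2\;\mathbb{E}Q_N \tag{6.15}$$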



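The appearance of $\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2$ is disturbing, as it introduces a new quantity into the system.
Fortunately, it is the last one. The point is that proceeding as above, we can show that

$$\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_{N+1}}\|Z\|_2^2 = \alpha + \mathbb{E}\left[\tanh\beta\left(\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right)\left[\xi_0^1\,\mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1 + \sqrt{\mathbb{E}T_N}\,X_N\right]\right] + \beta\,\mathbb{E}\mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2 \tag{6.16}$$

so that setting $U_N \equiv \mathbb{E}_{\tilde{\mathcal{Q}}_N}\|\bar Z\|_2^2$, we get, subtracting (6.14) from (6.16) and using (6.15), the simple recursion

$$\mathbb{E}U_{N+1} = \alpha\left(1 - \mathbb{E}Q_N\right) + \beta\left(1 - \mathbb{E}Q_N\right)\mathbb{E}U_N \tag{6.17}$$

From this we get (since all quantities considered are self-averaging, we drop the $\mathbb{E}$ to simplify the
notation), setting $m_1(N) \equiv \mathbb{E}_{\tilde{\mathcal{Q}}_N} Z_1$,

$$\begin{aligned}
T_{N+1} &= -\left(m_1(N+1)\right)^2 + \alpha Q_N + \beta U_N Q_N + \int d\mathcal{N}(g)\left[m_1(N) + \sqrt{T_N}\,g\right]\tanh\beta\left(m_1(N) + \sqrt{T_N}\,g\right) \\
&= m_1(N+1)\left(m_1(N) - m_1(N+1)\right) + \beta U_N Q_N + \beta T_N\left(1 - Q_N\right) + \alpha Q_N
\end{aligned} \tag{6.18}$$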



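where we used integration by parts. The complete system of recursion relations can thus be written
as

$$\begin{aligned}
m_1(N+1) &= \int d\mathcal{N}(g)\, \tanh\beta\left(m_1(N) + \sqrt{T_N}\,g\right) + O\left(N^{-1/4}\right) \\
T_{N+1} &= m_1(N+1)\left(m_1(N) - m_1(N+1)\right) + \beta U_N Q_N + \beta T_N\left(1 - Q_N\right) + \alpha Q_N + O\left(N^{-1/4}\right) \\
U_{N+1} &= \alpha\left(1 - Q_N\right) + \beta\left(1 - Q_N\right)U_N + O\left(N^{-1/4}\right) \\
Q_{N+1} &= \int d\mathcal{N}(g)\, \tanh^2\beta\left(m_1(N) + \sqrt{T_N}\,g\right) + O\left(N^{-1/4}\right)
\end{aligned} \tag{6.19}$$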
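To get a feeling for this recursion one can iterate it numerically with the error terms dropped. The following is only an illustrative sketch, not part of the proof: the Gaussian integrals are approximated by Gauss-Hermite quadrature, the parameters $\alpha$ and $\beta$ are held fixed (whereas in the text they drift slightly with $N$), and the function names, starting point and parameter values in the usage note are our own choices.

```python
import numpy as np

# Gauss-Hermite nodes/weights for the weight exp(-g^2/2), normalized to a probability measure.
_nodes, _weights = np.polynomial.hermite_e.hermegauss(80)
_weights = _weights / np.sqrt(2.0 * np.pi)

def gauss_mean(f):
    """Approximate the integral of f against the standard Gaussian measure."""
    return float(np.sum(_weights * f(_nodes)))

def step(state, alpha, beta):
    """One step of the recursion for (m_1, T, U, Q), without error terms."""
    m, T, U, Q = state
    sqrt_T = np.sqrt(max(T, 0.0))  # guard against tiny negative values
    th = lambda g: np.tanh(beta * (m + sqrt_T * g))
    m_new = gauss_mean(th)
    Q_new = gauss_mean(lambda g: th(g) ** 2)
    T_new = m_new * (m - m_new) + beta * U * Q + beta * T * (1.0 - Q) + alpha * Q
    U_new = alpha * (1.0 - Q) + beta * (1.0 - Q) * U
    return (m_new, T_new, U_new, Q_new)

def iterate(state, alpha, beta, n_steps):
    """Apply step() n_steps times."""
    for _ in range(n_steps):
        state = step(state, alpha, beta)
    return state
```

For instance, `iterate((1.0, 0.01, 0.01, 0.9), alpha=0.01, beta=2.0, n_steps=500)` starts near a stored pattern; if the iteration settles, the limiting values of $T/\alpha$ and $Q$ can be compared with the relation $r = q/(1-\beta+\beta q)^2$.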

If the solutions to this system of equations converge, then the limits $r = \lim_{N\uparrow\infty} T_N/\alpha$,
$q = \lim_{N\uparrow\infty} Q_N$ and $m_1 = \lim_{N\uparrow\infty} m_1(N)$ ($u = \lim_{N\uparrow\infty} U_N$ can be eliminated) must satisfy the
equations

$$m_1 = \int d\mathcal{N}(g)\, \tanh\left(\beta\left(m_1 + \sqrt{\alpha r}\,g\right)\right) \tag{6.20}$$

$$q = \int d\mathcal{N}(g)\, \tanh^2\left(\beta\left(m_1 + \sqrt{\alpha r}\,g\right)\right) \tag{6.21}$$

$$r = \frac{q}{\left(1 - \beta + \beta q\right)^2} \tag{6.22}$$

which are the equations for the replica symmetric solution of the Hopfield model found by Amit et
al. [AGS].
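For the reader's convenience, here is how the limit $u$ can be eliminated to obtain the equation for $r$. This is a short worked computation under the assumptions that the limits exist, that $T_N \to \alpha r$, and that $1 - \beta(1-q) \neq 0$. In the limit the term $m_1(N+1)\left(m_1(N) - m_1(N+1)\right)$ vanishes, so the recursions for $U_N$ and $T_N$ become

$$\begin{aligned}
u &= \alpha(1-q) + \beta(1-q)\,u
\quad\Longrightarrow\quad u = \frac{\alpha(1-q)}{1-\beta(1-q)}, \\
\alpha r &= \beta u q + \beta\alpha r(1-q) + \alpha q
\quad\Longrightarrow\quad \alpha r\left(1-\beta(1-q)\right) = q\left(\alpha + \beta u\right) = \frac{\alpha q}{1-\beta(1-q)}, \\
r &= \frac{q}{\left(1-\beta(1-q)\right)^2} = \frac{q}{\left(1-\beta+\beta q\right)^2}.
\end{aligned}$$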

In principle one might think that to prove convergence it is enough to study the stability of
the dynamical system above without the error terms. However, this is not quite true. Note that
the parameters $\beta$ and $\alpha$ of the quantities on the two sides of the equation differ slightly (although
this is suppressed in the notation). In particular, if we iterate too often, $\alpha$ will tend to zero. The
way out of this difficulty was proposed by Talagrand [T1]. We will briefly explain his idea. In
a simplified notation, we are in the following situation: We have a sequence $X_n(p)$ of functions
depending on a parameter $p$. There is an explicit sequence $p_n$, satisfying $|p_{n+1} - p_n| \leq c/n$ and a
function $F_p$ such that

$$X_{n+1}(p_{n+1}) = F_{p_n}\left(X_n(p_n)\right) + O\left(n^{-1/4}\right) \tag{6.23}$$

In this setting, we have the following lemma.

Lemma 6.3: Assume that there exists a domain $D$ containing a single fixed point $X^*(p)$ of $F_p$.
Assume that $F_p(X)$ is Lipschitz continuous as a function of $X$, Lipschitz continuous as a function of
$p$ uniformly for $X \in D$ and that for all $X \in D$, $F_p^k(X) \to X^*(p)$ as $k \uparrow \infty$. Assume we know that for all $n$
large enough, $X_n(p) \in D$. Then

$$\lim_{n\uparrow\infty} X_n(p) = X^*(p) \tag{6.24}$$

Proof: Let us choose an integer valued monotone increasing function $k(n)$ such that $k(n) \uparrow \infty$ as $n$
goes to infinity. Assume e.g. $k(n) \leq \ln n$. We will show that

$$\lim_{n\uparrow\infty} X_{n+k(n)}(p) = X^*(p) \tag{6.25}$$

To see this, note first that $|p_{n+k(n)} - p_n| \leq \frac{c\,k(n)}{n}$. By (6.23), we have, using the Lipschitz
properties of $F$,

$$X_{n+k(n)}(p) = F_p^{k(n)}\left(X_n(p_n)\right) + O\left(n^{-1/4}\right) \tag{6.26}$$

where we choose $p_n$ such that $p_{n+k(n)} = p$. Now since $X_n(p_n) \in D$, $\left|F_p^{k(n)}\left(X_n(p_n)\right) - X^*(p)\right| \downarrow 0$ as
$n$ and thus $k(n)$ goes to infinity, so that (6.26) implies (6.25). But (6.25) for any slowly diverging
function $k(n)$ implies the convergence of $X_n(p)$, as claimed.
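The mechanism of the lemma can be illustrated on a toy example. The sketch below is our own construction, not the Hopfield recursion: it uses the contraction $F_p(x) = x/2 + p$, whose unique fixed point is $X^*(p) = 2p$, a parameter sequence $p_n = p + 1/n$ (so that $|p_{n+1} - p_n| \leq c/n$), and a deterministic perturbation $n^{-1/4}$ standing in for the error term.

```python
def iterate_drifting_map(p_limit, n_steps, x0=0.0):
    """Iterate x_{n+1} = F_{p_n}(x_n) + n^{-1/4} with F_p(x) = x/2 + p (toy model)."""
    x = x0
    for n in range(1, n_steps + 1):
        p_n = p_limit + 1.0 / n   # slowly drifting parameter, converging to p_limit
        error = n ** -0.25        # stands in for the O(n^{-1/4}) error term
        x = 0.5 * x + p_n + error
    return x

# Distance to the fixed point 2*p_limit shrinks as the number of steps grows.
gap_short = abs(iterate_drifting_map(1.0, 10**3) - 2.0)
gap_long = abs(iterate_drifting_map(1.0, 10**6) - 2.0)
```

Because the map contracts much faster than the parameter drifts and the error decays, the iterates track the moving fixed point and `gap_long` is smaller than `gap_short`; the decay is slow, of order $n^{-1/4}$, as the error term suggests.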



This lemma can be applied to the recurrence (6.18). The main point to check is whether
the corresponding $F_p$ attracts a domain in which the parameters $m_1(N), T_N, U_N, Q_N$ are a priori
located due to the support properties of the measure $\mathcal{Q}^{(1,1)}_{N,\beta,\rho}$. This stability analysis was carried
out (for an equivalent system) by Talagrand and answered in the affirmative. We do not want to
repeat this tedious, but in principle elementary computation here.

We would like to make, however, some remarks. It is clear that if we consider conditional
measures, then we can always force the parameters $m_1(N), T_N, U_N, Q_N$ to be in some domain.
Thus, in principle, we could first study the fixpoints of (6.18), determine their domains of attraction
and then define corresponding conditional Gibbs measures. However, these measures may then be
metastable. Also, of course, at least in our derivation, we do need to verify the local convexity in
the corresponding domains since this was used in the derivation of the equations (6.18).